# @angeloskath Angelos Katharopoulos
Angelos Katharopoulos posts on X most often about model, code, ultra, and faster. They currently have [-----] followers and [---] posts still getting attention, totaling [-----] engagements in the last [--] hours.
### Engagements: [-----] [#](/creator/twitter::874169451356278784/interactions)

- [--] Week [-----] -42%
- [--] Month [------] +551%
- [--] Months [------] +617%
- [--] Year [------] -55%
### Mentions: [--] [#](/creator/twitter::874169451356278784/posts_active)

### Followers: [-----] [#](/creator/twitter::874169451356278784/followers)

- [--] Week [-----] +0.82%
- [--] Month [-----] +1.80%
- [--] Months [-----] +12%
- [--] Year [-----] +24%
### CreatorRank: [---------] [#](/creator/twitter::874169451356278784/influencer_rank)

### Social Influence
**Social category influence**
[products](/list/products) [social networks](/list/social-networks) [technology brands](/list/technology-brands) [fashion brands](/list/fashion-brands) [stocks](/list/stocks) [cryptocurrencies](/list/cryptocurrencies) [finance](/list/finance) [countries](/list/countries)
**Social topic influence**
[model](/topic/model), [code](/topic/code), [ultra](/topic/ultra), [faster](/topic/faster), [apple](/topic/apple), [pip](/topic/pip), [in the](/topic/in-the), [attention](/topic/attention), [swift](/topic/swift), [inference](/topic/inference)
**Top accounts mentioned or mentioned by**
@awnihannun @francoisfleuret @ivanfioravanti @princecanuma @idiapch @apoorv2904 @unixpickle @epflen @priontific @nik0spapp @alexbarron1 @walkfourmore @grangierdavid @mepavelkral @diganijagrit @icmlconf @ylecun @pierreablin @neuripsconf @smallernnspls
### Top Social Posts
Top posts by engagements in the last [--] hours
"I am really excited about our latest work A simple efficient framework to experiment with modern neural networks even on your laptop [--] lines to write a transformer LM ๐ฅณ Just in time for the holidays we are releasing some new software today from Apple machine learning research. MLX is an efficient machine learning framework specifically designed for Apple silicon (i.e. your laptop) Code: https://t.co/Kbis7IrP80 Docs: https://t.co/CUQb80HGut Just in time for the holidays we are releasing some new software today from Apple machine learning research. MLX is an efficient machine learning"
[X Link](https://x.com/anyuser/status/1732304844403429477) 2023-12-06T07:43Z [----] followers, 35.2K engagements
"How about your personal chat GPT on your M2 Ultra Amazing model by Mistral AI and [--] day to implement it in MLX. Mixtral 8x7B in MLX https://t.co/wh4PmlQltK Runs on an M2 Ultra ๐ข๐ข https://t.co/AFeEKyvmUu Mixtral 8x7B in MLX https://t.co/wh4PmlQltK Runs on an M2 Ultra ๐ข๐ข https://t.co/AFeEKyvmUu"
[X Link](https://x.com/anyuser/status/1734653464276602884) 2023-12-12T19:16Z [----] followers, [----] engagements
"@demirbasayyuce @awnihannun Well actually I dont think you need any of that due to unified memory. Quantizing the Lora example in mlx should work out of the box. Havent tried it yet but I dont see why not"
[X Link](https://x.com/anyuser/status/1738008676139745355) 2023-12-22T01:28Z [----] followers, [---] engagements
"We implemented quantization from scratch in a week. I think that is one of the biggest strengths of MLX. Easy to use but also easy to extend and customize. We cant wait to see what people will implement in a month Big update to MLX but especially ๐ฅ N-bit quantization and quantized matmul kernels Thanks to the wizardry of @angeloskath pip install -U mlx https://t.co/YcUhRv31TP Big update to MLX but especially ๐ฅ N-bit quantization and quantized matmul kernels Thanks to the wizardry of @angeloskath pip install -U mlx https://t.co/YcUhRv31TP"
[X Link](https://x.com/anyuser/status/1738075817266356674) 2023-12-22T05:55Z [----] followers, 27.1K engagements
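For readers who want to try the quantization mentioned above: a minimal sketch, assuming the current mlx/mlx.nn Python API (nn.quantize swaps nn.Linear layers for quantized equivalents in place); the layer sizes are illustrative, not values from the post.

```python
# Minimal sketch: quantize an MLX model's linear layers to 4 bits.
import mlx.core as mx
import mlx.nn as nn

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))
nn.quantize(model, group_size=64, bits=4)  # replaces nn.Linear with nn.QuantizedLinear in place
y = model(mx.random.normal((1, 512)))
mx.eval(y)  # MLX is lazy; this forces the quantized matmuls to actually run
```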
"@WankyuChoi I am super happy you picked it up ๐. I actually added it to the example after seeing your previous demo and comments. Great video as always"
[X Link](https://x.com/anyuser/status/1741224588603007004) 2023-12-30T22:27Z [----] followers, [---] engagements
"@KassinosS @awnihannun Out of curiosity how would a simple relu MLP that passes the inputs through a simple sinusoidal positional encoding do in that problem In my experience they are a pretty good baseline for any such function approximation. See for examples of what I mean. https://bmild.github.io/fourfeat/ https://bmild.github.io/fourfeat/"
[X Link](https://x.com/anyuser/status/1749738202511089668) 2024-01-23T10:17Z [----] followers, [---] engagements
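A sketch of the baseline suggested in that reply, under stated assumptions: a ReLU MLP over random Fourier features (the recipe from https://bmild.github.io/fourfeat/), written with MLX. Every dimension and the scale below are illustrative choices, not values from the post.

```python
# Sketch: ReLU MLP on random Fourier features of the input coordinates.
import mlx.core as mx
import mlx.nn as nn

class FourierFeatureMLP(nn.Module):
    def __init__(self, in_dims=2, n_features=256, hidden=256, out_dims=3, scale=10.0):
        super().__init__()
        # Fixed random projection; x is encoded as [sin(2*pi*Bx), cos(2*pi*Bx)].
        self.B = mx.random.normal((in_dims, n_features)) * scale
        self.mlp = nn.Sequential(
            nn.Linear(2 * n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dims),
        )

    def __call__(self, x):
        proj = 2 * mx.pi * (x @ self.B)
        features = mx.concatenate([mx.sin(proj), mx.cos(proj)], axis=-1)
        return self.mlp(features)

# Example: map 2D pixel coordinates to RGB values.
y = FourierFeatureMLP()(mx.random.uniform(shape=(64, 2)))
```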
"Looking back at all the amazing things people built with MLX in a couple of months I am incredibly excited to see the things that will be built now in a familiar dev environment in Swift Just [--] lines of code to write a general multi-head attention in MLX Swift ๐๐๐ As part of our goal to make MLX a great research tool we're expanding support to new languages like Swift and C making experimentation on Apple silicon easier for ML researchers. Video generating text with Mistral 7B and MLX Swift ๐ MLX is an array framework for machine https://t.co/nKJcgqePyr As part of our goal to make MLX a"
[X Link](https://x.com/anyuser/status/1760066909557703010) 2024-02-20T22:20Z [----] followers, 11.7K engagements
"What I find even cooler than training on an iPhone is that it is done with just [--] lines of code that are super readable and very familiar to anyone that writes training loops in python. Let's go MLX Swift ๐๐๐ Using MLX Swift to train LeNet on MNIST. Takes less than a minute on my iPhone [--]. Example here: https://t.co/lQs6mECoIK @ylecun long-live MNIST https://t.co/T5eSdcDBM3 Using MLX Swift to train LeNet on MNIST. Takes less than a minute on my iPhone [--]. Example here: https://t.co/lQs6mECoIK @ylecun long-live MNIST https://t.co/T5eSdcDBM3"
[X Link](https://x.com/anyuser/status/1764876370310832509) 2024-03-05T04:51Z [----] followers, 19.3K engagements
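The Swift training loop referenced above mirrors the usual MLX Python loop. A generic sketch of that loop follows, with random stand-in data and assumed hyperparameters; it is not the code from the linked example.

```python
# Sketch of a typical MLX training loop (the Python analogue of the Swift example).
import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim

model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
optimizer = optim.SGD(learning_rate=0.1)

def loss_fn(model, X, y):
    return nn.losses.cross_entropy(model(X), y, reduction="mean")

loss_and_grad = nn.value_and_grad(model, loss_fn)

# Random stand-in batches; a real run would iterate over MNIST.
batches = [(mx.random.normal((32, 784)), mx.random.randint(0, 10, (32,)))
           for _ in range(10)]
for X, y in batches:
    loss, grads = loss_and_grad(model, X, y)      # forward and backward pass
    optimizer.update(model, grads)                # apply the gradients
    mx.eval(model.parameters(), optimizer.state)  # materialize the lazy update
```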
"Oh and the model definition looks even more familiar"
[X Link](https://x.com/anyuser/status/1764876372189859872) 2024-03-05T04:51Z [----] followers, [---] engagements
"This is too cool. Now let's combine it with a TTS model and have it tell us nice stories while looking at the beautiful lake. Apple MLX on Vision Pro YES YOU CAN BOOM Here the raw video of MLX Swift LLMEval example running natively on the device Thanks @awnihannun ๐ ๐ฅ๐ฅ๐ฅ #VisionPro #LLM #Apple https://t.co/35T960KopQ Apple MLX on Vision Pro YES YOU CAN BOOM Here the raw video of MLX Swift LLMEval example running natively on the device Thanks @awnihannun ๐ ๐ฅ๐ฅ๐ฅ #VisionPro #LLM #Apple https://t.co/35T960KopQ"
[X Link](https://x.com/anyuser/status/1770516113610424451) 2024-03-20T18:21Z [----] followers, [----] engagements
"I know which model I am uploading to MLX community today ๐ https://t.co/5QagPC5fjx https://t.co/5QagPC5fjx"
[X Link](https://x.com/anyuser/status/1772679052849000714) 2024-03-26T17:36Z [----] followers, [----] engagements
"For the native Greek speakers you can already interact with Meltemi on your laptop directly from HF using MLX. I also uploaded a quantized 4-bit version on mlx-community for faster inference. Almost [--] tokens per second on a MacBook Air and [--] on an M2 Ultra https://t.co/5QagPC5fjx https://t.co/5QagPC5fjx"
[X Link](https://x.com/anyuser/status/1772754914680471688) 2024-03-26T22:38Z [----] followers, [----] engagements
"To reproduce the video above first pip install -U mlx_lm and then python -m mlx_lm.generate --model mlx-community/ilsp-Meltemi-7B-Instruct-v1-4bit --prompt " ." --temp [---] --max-tokens [----] on any M-series Mac"
[X Link](https://x.com/anyuser/status/1772754916672786561) 2024-03-26T22:38Z [----] followers, [---] engagements
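The same generation can be driven from the Python API instead of the CLI. A minimal sketch, assuming the current mlx_lm load/generate interface; the prompt is a placeholder since the original Greek prompt is elided above, and max_tokens is an illustrative value.

```python
# Sketch: Python equivalent of the mlx_lm.generate CLI invocation above.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/ilsp-Meltemi-7B-Instruct-v1-4bit")
text = generate(model, tokenizer, prompt="...", max_tokens=512)  # placeholder prompt
print(text)
```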
"Congrats to all the researchers from ILSP and Athena research center that worked on this I couldn't find twitter handles to tag people so please let me know if I should be tagging someone"
[X Link](https://x.com/anyuser/status/1772754917968773405) 2024-03-26T22:38Z [----] followers, [---] engagements
"@walkfourmore Hmm just ran it on an 8GB M1 Mac mini (same chip) and it gets a very respectable [--] tps. Feel free to file an issue on GitHub with details to help you debug your setup. Otherwise doing a fresh install on a new python environment should probably be enough"
[X Link](https://x.com/angeloskath/status/1773855822701093294) 2024-03-29T23:32Z [----] followers, [--] engagements
"@walkfourmore https://github.com/ml-explore/mlx-examples https://github.com/ml-explore/mlx https://github.com/ml-explore/mlx-examples https://github.com/ml-explore/mlx"
[X Link](https://x.com/angeloskath/status/1773936864346714284) 2024-03-30T04:54Z [----] followers, [--] engagements
"@walkfourmore You can fine tune it using LoRA on your laptop (see the MLX examples). An 8GB MacBook Air wont break any speed records but you can easily fine tune it on your data over night if they are about a book long"
[X Link](https://x.com/anyuser/status/1774190727863664886) 2024-03-30T21:43Z [----] followers, [--] engagements
"I have to say it because @awnihannun is quick to give credit to others but doesnt take much for himself. This performance improvement largely comes from his relentless hunting down of every kind of overhead in MLX the past weeks. Kudos MLX [----] [----] faster generation across model sizes and machines. tokens-per-second for 4-bit models: https://t.co/o5f3WmVzGZ MLX [----] [----] faster generation across model sizes and machines. tokens-per-second for 4-bit models: https://t.co/o5f3WmVzGZ"
[X Link](https://x.com/anyuser/status/1781772786366857445) 2024-04-20T19:51Z [----] followers, 17.8K engagements
"My favorite new addition to MLX in v0.14 is the option to just-in-time compile the kernels in order to create a small binary to ease deployment More than 10x reduction to the Metal library size. https://ml-explore.github.io/mlx/build/html/install.html#binary-size-minimization pip install -U mlx https://t.co/CJYcIRJr9J https://ml-explore.github.io/mlx/build/html/install.html#binary-size-minimization pip install -U mlx https://t.co/CJYcIRJr9J"
[X Link](https://x.com/anyuser/status/1794062376796623122) 2024-05-24T17:46Z [----] followers, [----] engagements
"Effort led by @awnihannun and I have to say that readability and development did not suffer at all to enable this new capability. If anything kernel instantiation (using the preprocessor) and kernel definition are now nicely separated"
[X Link](https://x.com/anyuser/status/1794062378252005458) 2024-05-24T17:46Z [----] followers, [---] engagements
"If you are at #ICML2024 last chance today to play with MLX on the M2 Ultra or iPad at the Apple booth If you are a fan already come tell us the cool things you built"
[X Link](https://x.com/anyuser/status/1816023031132668275) 2024-07-24T08:10Z [----] followers, [----] engagements
"Love this I think the barrier to entry for playing with LLMs doesnt get any lower Made a minimal but fast example of text generation with Llama [---] in MLX. Minimal: [--] file [---] lines of simple code [--] dependencies Fast: 100+ toks/sec with 4-bit 8B on M2 Ultra Code: https://t.co/aCfcEG7geA https://t.co/mw9PhLilBn Made a minimal but fast example of text generation with Llama [---] in MLX. Minimal: [--] file [---] lines of simple code [--] dependencies Fast: 100+ toks/sec with 4-bit 8B on M2 Ultra Code: https://t.co/aCfcEG7geA https://t.co/mw9PhLilBn"
[X Link](https://x.com/anyuser/status/1821980673516962152) 2024-08-09T18:43Z [----] followers, [----] engagements
"It is quite easy to underestimate how hard that is to do in any other framework Lazy evaluation in MLX is a very cool feature but takes some getting used to. Here's one example where it is useful: for low RAM devices lazy loading/computation makes it almost trivial to stream a model from disk without ever materializing the full thing. https://t.co/nodT6gLYza Lazy evaluation in MLX is a very cool feature but takes some getting used to. Here's one example where it is useful: for low RAM devices lazy loading/computation makes it almost trivial to stream a model from disk without ever"
[X Link](https://x.com/anyuser/status/1831808320270999990) 2024-09-05T21:35Z [----] followers, [----] engagements
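A sketch of the pattern the quoted post describes, assuming mx.load's lazy loading of safetensors files (each array is only materialized when something evaluates it); the file name is hypothetical.

```python
# Sketch: stream weights from disk one tensor at a time using MLX's laziness.
import mlx.core as mx

weights = mx.load("model.safetensors")  # hypothetical file; nothing is read eagerly
for name, w in weights.items():
    # Evaluating one tensor at a time keeps peak memory near a single tensor's
    # size, e.g. here casting each weight before using or re-saving it.
    mx.eval(w.astype(mx.float16))
```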
"@priontific This is possible in all frameworks. The only thing that changes is simplicity and efficiency. Eg I am pretty sure that in PyTorch one would have to manually unload layers mid computation. Simply put with MLX the efficiency is great and I dont think it gets any simpler"
[X Link](https://x.com/anyuser/status/1831959316775211294) 2024-09-06T07:35Z [----] followers, [--] engagements
"It is rare to see such a thorough and complete analysis nowadays. [--] awesome pages of appendix Any comparison/ablation you didn't know you wanted is probably there already. It was a privilege to see it being developed by Matteo David and Pierre. Stop discarding your old gradients Introducing AdEMAMix a novel (first-order) optimizer capable of outperforming Adam. Lets have a thread on momentum and the surprising relevance of very old gradients. A joint work with @GrangierDavid and @PierreAblin #ml #optimization 1/๐งต https://t.co/MbGVcSIPdg Stop discarding your old gradients Introducing"
[X Link](https://x.com/anyuser/status/1832179810124362118) 2024-09-06T22:11Z [----] followers, [----] engagements
"This is very cool It has also been a few times I have wanted to shoutout at . @zcbenz has done an awesome job with the MLX bindings to Node.js. They are quite enjoyable to read with the proper amount of JS specific add-ons or quirks. https://github.com/frost-beta/node-mlx I wrote a CLI tool for semantic images search using Node.js and MLX. No third party API is used everything runs locally. Index is built of image embeddings with CLIP model and searching is just computing cosine similarities. https://t.co/QNuwkKX1x6 https://t.co/YNNgmetG9k https://github.com/frost-beta/node-mlx I wrote a CLI"
[X Link](https://x.com/anyuser/status/1835722216673300773) 2024-09-16T16:47Z [----] followers, [---] engagements
"An interesting take-away from this work is that sequence-level routing works in real world tasks like question answering by routing based on a short prefix. This enables practically communication-free training of a traditional mixture of experts model and cheap inference. Dont have fast interconnect No problem No need for nodes to talk We introduce SMALLTALK LM an innovative method for training a mixture of language models (almost) asynchronously. SMALLTALK LM achieves better perplexity and better accuracy on a majority of downstream tasks https://t.co/ZWn0eQxzEe Dont have fast interconnect"
[X Link](https://x.com/anyuser/status/1844626390353838452) 2024-10-11T06:29Z [----] followers, [----] engagements
"Fantastic work by @NasFilippova before even starting her PhD ๐"
[X Link](https://x.com/anyuser/status/1844626391612129537) 2024-10-11T06:29Z [----] followers, [---] engagements
"@MePavelKral @awnihannun Its currently around 50GB but there are many things left to do to get this down to less than [--]. More PRs coming โบ"
[X Link](https://x.com/anyuser/status/1844815397675213161) 2024-10-11T19:00Z [----] followers, [--] engagements
"@priontific @MePavelKral @awnihannun Hmm try to not generate progress images it saves some memory (it shouldn't but it does will look into it). For 512x512 batch size [--] I am getting 49GB so it should be doable without swap. Agreed though QLoRA (and/or checkpointing) is gonna make a huge difference"
[X Link](https://x.com/anyuser/status/1844927281325572454) 2024-10-12T02:25Z [----] followers, [---] engagements
"Now merged with a small README to help people get started. Enjoy Clear your weekend for some FLUX LoRA fine-tuning in MLX thanks to @angeloskath https://t.co/PeM8vnHoZt Clear your weekend for some FLUX LoRA fine-tuning in MLX thanks to @angeloskath https://t.co/PeM8vnHoZt"
[X Link](https://x.com/anyuser/status/1844957614402396373) 2024-10-12T04:25Z [----] followers, [----] engagements
"@awnihannun @priontific It is also the schnell model which is significantly harder to fine tune"
[X Link](https://x.com/anyuser/status/1845185145772585346) 2024-10-12T19:29Z [----] followers, [---] engagements
"In the new MLX release we break the 110tps barrier out of the box for Mistral 7B Here is the speed (before and after) of a bunch of LMs writing a story about Einstein on an M2 Ultra"
[X Link](https://x.com/anyuser/status/1847367375467008296) 2024-10-18T20:01Z [----] followers, [----] engagements
"@ruairiSpain This is all coming from the fused attention kernel which we just added. It may even have some space for further optimization in the next few days"
[X Link](https://x.com/anyuser/status/1847381368445288593) 2024-10-18T20:56Z [----] followers, [---] engagements
"It has been some time since we added distributed support to MLX. Even via 10Gbps Ethernet distributed LoRA scales perfectly. I can't wait for people to train large adapters via thunderbolt or small ones over Wi-Fi. There is so much compute lying around (nodes are M2 Ultras)"
[X Link](https://x.com/anyuser/status/1852097200064774461) 2024-10-31T21:15Z [----] followers, [----] engagements
"@Prince_Canuma There is but they need a bit of a refresh. I will see if it makes sense to make a small launch helper script that will prevent a bunch of "foot-guns" related to launching MPI jobs. https://ml-explore.github.io/mlx/build/html/usage/distributed.html https://ml-explore.github.io/mlx/build/html/usage/distributed.html"
[X Link](https://x.com/anyuser/status/1852122737361432795) 2024-10-31T22:57Z [----] followers, [---] engagements
"@Prince_Canuma Here is the PR It's been there months ๐
we 'll merge it shortly. TL;DR: The mlx_lm.lora supports distributed finetuning. All you have to do is launch it with mpirun . https://github.com/ml-explore/mlx-examples/pull/821 https://github.com/ml-explore/mlx-examples/pull/821"
[X Link](https://x.com/anyuser/status/1852128822436729241) 2024-10-31T23:21Z [----] followers, [---] engagements
"@Prince_Canuma Indeed. It is also the case for flux finetuning btw"
[X Link](https://x.com/anyuser/status/1852131124224700669) 2024-10-31T23:30Z [----] followers, [--] engagements
"@Prince_Canuma Each machine has the full model ie this is simple data parallel training. Given that the Macs have comparatively very high memory capacity per FLOP ratio I think it is also the best strategy. Combined with grad accumulation and checkpointing you can fine-tune almost anything"
[X Link](https://x.com/anyuser/status/1852132990702490002) 2024-10-31T23:38Z [----] followers, [---] engagements
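A minimal sketch of the data-parallel pattern described in that reply, assuming MLX's distributed API (mx.distributed.init and all_sum) and a job launched with mpirun; the gradient pytree here is a placeholder rather than a full training setup.

```python
# Sketch: average per-node gradients with an all_sum, the core of data-parallel training.
import mlx.core as mx
from mlx.utils import tree_map

group = mx.distributed.init()  # e.g. started via mpirun across several Macs

def average_gradients(grads):
    # grads is the pytree of local gradients computed on this node
    return tree_map(lambda g: mx.distributed.all_sum(g) / group.size(), grads)
```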
"@Prince_Canuma Yeah of course you can implement it. But fyi pipeline training is significantly more complicated than pipeline inference (and harder to get linear scaling as well)"
[X Link](https://x.com/anyuser/status/1852146572789956937) 2024-11-01T00:32Z [----] followers, [--] engagements
"@unixpickle Indeed but isnt this like 90% of fine-tuning runs Ymmv but for a 14B even 200M parameters LoRA scales very reasonably via Ethernet. Something like 5x speed up for [--] nodes"
[X Link](https://x.com/anyuser/status/1852211131307237426) 2024-11-01T04:48Z [----] followers, [--] engagements
"@ivanfioravanti Thats exciting ๐. Ethernet thunderbolt or WiFi"
[X Link](https://x.com/anyuser/status/1853163042135068947) 2024-11-03T19:51Z [----] followers, [--] engagements
"@ivanfioravanti Looking at that you should probably also increase the batch size. It will help amortize the communication cost. Here each machine deals with [--] sequences I would do [--] per machine meaning [--] total"
[X Link](https://x.com/anyuser/status/1853186254944477679) 2024-11-03T21:23Z [----] followers, [--] engagements
"At these speeds who needs RAG am I right In the latest MLX generating with long prompts is much faster with KV quantization. Thanks to @alex_barron1. 4-bit Llama 8B with a [-----] token prompt + 8-bit KV cache generates at [--] toks/sec on an M2 Ultra: https://t.co/TlBZTOhpfD In the latest MLX generating with long prompts is much faster with KV quantization. Thanks to @alex_barron1. 4-bit Llama 8B with a [-----] token prompt + 8-bit KV cache generates at [--] toks/sec on an M2 Ultra: https://t.co/TlBZTOhpfD"
[X Link](https://x.com/anyuser/status/1854284083662324148) 2024-11-06T22:05Z [----] followers, [---] engagements
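mlx_lm exposes the quantized KV cache through generation options. A hedged sketch, assuming the kv_bits/kv_group_size parameters that shipped alongside this feature still carry these names; verify against your mlx_lm version. The model name and prompt are illustrative.

```python
# Sketch: generation with an 8-bit quantized KV cache in mlx_lm (assumed kwargs).
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Meta-Llama-3-8B-Instruct-4bit")
text = generate(model, tokenizer, prompt="Summarize this long document ...",
                max_tokens=256, kv_bits=8, kv_group_size=64)
```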
"@unixpickle Our convs (especially the backward) do need some love. We are working on it and they will be faster soon ๐"
[X Link](https://x.com/anyuser/status/1859751519643632052) 2024-11-22T00:11Z [----] followers, [---] engagements
"@caviterginsoy @ivanfioravanti @alex_barron1 @DiganiJagrit Neglected a bit for sure but what about ๐ https://github.com/ml-explore/mlx-data https://github.com/ml-explore/mlx-data"
[X Link](https://x.com/anyuser/status/1860076969700036998) 2024-11-22T21:44Z [----] followers, [--] engagements
"@ivanfioravanti Hm feel free to drop an issue if you think that there is something to be fixed"
[X Link](https://x.com/anyuser/status/1908187050945437733) 2025-04-04T15:57Z [----] followers, [---] engagements
"@DiganiJagrit has made awesome building blocks to write matmul kernels in MLX. It would be so much more work to write this otherwise and it would also be slower. Latest mlx + mlx-lm have much faster prompt processing speeds for MoEs. Thanks to some magic from @angeloskath pip install -U mlx-lm Deep Seek v3 up to 2x faster Mixtral up to 3.5x faster Llama [--] up to 2x faster Latest mlx + mlx-lm have much faster prompt processing speeds for MoEs. Thanks to some magic from @angeloskath pip install -U mlx-lm Deep Seek v3 up to 2x faster Mixtral up to 3.5x faster Llama [--] up to 2x faster"
[X Link](https://x.com/anyuser/status/1913341494792241245) 2025-04-18T21:18Z [----] followers, [----] engagements
"Kudos to David Koski From discussion to merged PR in [--] days. You can't beat open source We heard developer feedback on Monday that the MLX Swift LLM API is hard to get started with. So we went to work. We made a new and improved streamlined API. Now [--] lines toload an LLM or VLM and start a chat session in a Swift project. Happy last day of #WWDC25 https://t.co/NA00RJqjVR We heard developer feedback on Monday that the MLX Swift LLM API is hard to get started with. So we went to work. We made a new and improved streamlined API. Now [--] lines toload an LLM or VLM and start a chat session in a"
[X Link](https://x.com/anyuser/status/1933595157708091849) 2025-06-13T18:39Z [----] followers, [----] engagements
"@awnihannun added batched generation to MLX-LM [--] months ago. Everybody since has been asking for batching in the MLX-LM server. Well enjoy the first version in the latest MLX-LM release. The following video is serving [--] consecutive requests for Qwen3 30B on an M2 Ultra"
[X Link](https://x.com/angeloskath/status/1996364526749639032) 2025-12-03T23:42Z [----] followers, 14.2K engagements
"@RickRossTN @awnihannun Nice It depends on the model as well. MoEs will hit more experts when batched so they will scale a little worse than dense models"
[X Link](https://x.com/angeloskath/status/1998536452855181366) 2025-12-09T23:33Z [----] followers, [--] engagements
"@ivanfioravanti @awnihannun ๐ Let us know how it goes Deepseek V3.2 won't work with chat cause the chat template is missing. I changed it in the docs sorry for the large download"
[X Link](https://x.com/angeloskath/status/2001372368544129502) 2025-12-17T19:22Z [----] followers, [---] engagements
"Low latency communication is crucial for tensor parallel inference which is now available on the latest mlx-lm (not on pypi yet). In the following video Devstral is generating a quicksort in C++ 1.7x faster on [--] M3 Ultras (right) vs on [--] (left). The latest MLX is out And ithas a new distributed back-end (JACCL) that uses RDMA over TB5 for super low-latency communication across multiple Macs. Thanks to @angeloskath https://t.co/254dMxND9W The latest MLX is out And ithas a new distributed back-end (JACCL) that uses RDMA over TB5 for super low-latency communication across multiple Macs. Thanks"
[X Link](https://x.com/angeloskath/status/2001739468425040002) 2025-12-18T19:40Z [----] followers, [----] engagements
"Latest MLX release on macOS [----] has an updated JACCL with significantly increased distributed bandwidth. (Merged PR This results in great prompt processing scaling and reduced time to first token 3.3x speedup for both MoEs and dense models (4 nodes). https://github.com/ml-explore/mlx/pull/3094 https://github.com/ml-explore/mlx/pull/3094"
[X Link](https://x.com/angeloskath/status/2019968198322577821) 2026-02-07T02:55Z [----] followers, 18.7K engagements
"Moving the latest LLM from one Mac to the other can now be done much faster using mlx distributed. SSD write speed is now the bottleneck at around [---] GB/s. Here broadcasting M2.5 8bit in 45s to [--] other Macs. Needs macOS [----] and mlx-lm main not yet on PyPi"
[X Link](https://x.com/angeloskath/status/2022502242893586670) 2026-02-14T02:44Z [----] followers, [----] engagements
"I am really excited about our latest work A simple efficient framework to experiment with modern neural networks even on your laptop [--] lines to write a transformer LM ๐ฅณ Just in time for the holidays we are releasing some new software today from Apple machine learning research. MLX is an efficient machine learning framework specifically designed for Apple silicon (i.e. your laptop) Code: https://t.co/Kbis7IrP80 Docs: https://t.co/CUQb80HGut Just in time for the holidays we are releasing some new software today from Apple machine learning research. MLX is an efficient machine learning"
[X Link](https://x.com/anyuser/status/1732304844403429477) 2023-12-06T07:43Z [----] followers, 35.2K engagements
"Just in time for the holidays we are releasing some new software today from Apple machine learning research. MLX is an efficient machine learning framework specifically designed for Apple silicon (i.e. your laptop) Code: Docs: https://ml-explore.github.io/mlx/build/html/index.html https://github.com/ml-explore/mlx https://ml-explore.github.io/mlx/build/html/index.html https://github.com/ml-explore/mlx"
[X Link](https://x.com/anyuser/status/1732184443451019431) 2023-12-05T23:45Z 37.8K followers, 901.3K engagements
"We implemented quantization from scratch in a week. I think that is one of the biggest strengths of MLX. Easy to use but also easy to extend and customize. We cant wait to see what people will implement in a month Big update to MLX but especially ๐ฅ N-bit quantization and quantized matmul kernels Thanks to the wizardry of @angeloskath pip install -U mlx https://t.co/YcUhRv31TP Big update to MLX but especially ๐ฅ N-bit quantization and quantized matmul kernels Thanks to the wizardry of @angeloskath pip install -U mlx https://t.co/YcUhRv31TP"
[X Link](https://x.com/anyuser/status/1738075817266356674) 2023-12-22T05:55Z [----] followers, 27.1K engagements
"Big update to MLX but especially ๐ฅ N-bit quantization and quantized matmul kernels Thanks to the wizardry of @angeloskath pip install -U mlx"
[X Link](https://x.com/anyuser/status/1738030331180253610) 2023-12-22T02:54Z 37.8K followers, 121.6K engagements
"What I find even cooler than training on an iPhone is that it is done with just [--] lines of code that are super readable and very familiar to anyone that writes training loops in python. Let's go MLX Swift ๐๐๐ Using MLX Swift to train LeNet on MNIST. Takes less than a minute on my iPhone [--]. Example here: https://t.co/lQs6mECoIK @ylecun long-live MNIST https://t.co/T5eSdcDBM3 Using MLX Swift to train LeNet on MNIST. Takes less than a minute on my iPhone [--]. Example here: https://t.co/lQs6mECoIK @ylecun long-live MNIST https://t.co/T5eSdcDBM3"
[X Link](https://x.com/anyuser/status/1764876370310832509) 2024-03-05T04:51Z [----] followers, 19.3K engagements
"Using MLX Swift to train LeNet on MNIST. Takes less than a minute on my iPhone [--]. Example here: @ylecun long-live MNIST https://github.com/ml-explore/mlx-swift-examples https://github.com/ml-explore/mlx-swift-examples"
[X Link](https://x.com/anyuser/status/1764872668560679000) 2024-03-05T04:36Z 37.8K followers, 93.8K engagements
"I have to say it because @awnihannun is quick to give credit to others but doesnt take much for himself. This performance improvement largely comes from his relentless hunting down of every kind of overhead in MLX the past weeks. Kudos MLX [----] [----] faster generation across model sizes and machines. tokens-per-second for 4-bit models: https://t.co/o5f3WmVzGZ MLX [----] [----] faster generation across model sizes and machines. tokens-per-second for 4-bit models: https://t.co/o5f3WmVzGZ"
[X Link](https://x.com/anyuser/status/1781772786366857445) 2024-04-20T19:51Z [----] followers, 17.8K engagements
"MLX [----] [----] faster generation across model sizes and machines. tokens-per-second for 4-bit models:"
[X Link](https://x.com/anyuser/status/1781759608966860981) 2024-04-20T18:59Z 37.8K followers, 29.2K engagements
"It has been some time since we added distributed support to MLX. Even via 10Gbps Ethernet distributed LoRA scales perfectly. I can't wait for people to train large adapters via thunderbolt or small ones over Wi-Fi. There is so much compute lying around (nodes are M2 Ultras)"
[X Link](https://x.com/anyuser/status/1852097200064774461) 2024-10-31T21:15Z [----] followers, [----] engagements
"If you are at #ICML2024 last chance today to play with MLX on the M2 Ultra or iPad at the Apple booth If you are a fan already come tell us the cool things you built"
[X Link](https://x.com/anyuser/status/1816023031132668275) 2024-07-24T08:10Z [----] followers, [----] engagements
"Looking back at all the amazing things people built with MLX in a couple of months I am incredibly excited to see the things that will be built now in a familiar dev environment in Swift Just [--] lines of code to write a general multi-head attention in MLX Swift ๐๐๐ As part of our goal to make MLX a great research tool we're expanding support to new languages like Swift and C making experimentation on Apple silicon easier for ML researchers. Video generating text with Mistral 7B and MLX Swift ๐ MLX is an array framework for machine https://t.co/nKJcgqePyr As part of our goal to make MLX a"
[X Link](https://x.com/anyuser/status/1760066909557703010) 2024-02-20T22:20Z [----] followers, 11.7K engagements
"As part of our goal to make MLX a great research tool we're expanding support to new languages like Swift and C making experimentation on Apple silicon easier for ML researchers. Video generating text with Mistral 7B and MLX Swift ๐ MLX is an array framework for machine learning research on Apple silicon. MLX is intended for research and not for production deployment of models in apps"
[X Link](https://x.com/anyuser/status/1760032275893526643) 2024-02-20T20:02Z 37.8K followers, 89K engagements
"In the new MLX release we break the 110tps barrier out of the box for Mistral 7B Here is the speed (before and after) of a bunch of LMs writing a story about Einstein on an M2 Ultra"
[X Link](https://x.com/anyuser/status/1847367375467008296) 2024-10-18T20:01Z [----] followers, [----] engagements
"Code is also available If you want to experiment with clustered attention all you need to do is pip install pytorch-fast-transformers and then use attention_type="improved-clustered". Enjoy One paper accepted at @NeurIPSConf with @apoorv2904 and @angeloskath on speeding up attention by clustering the queries. The nice thing is that this can be used for inference with standard pre-trained models. https://t.co/lSK2DgGiQC @Idiap_ch @unige_en @EPFL_en @snsf_ch One paper accepted at @NeurIPSConf with @apoorv2904 and @angeloskath on speeding up attention by clustering the queries. The nice thing is"
[X Link](https://x.com/anyuser/status/1309813949446184961) 2020-09-26T11:15Z [----] followers, [--] engagements
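Putting the instructions from that post together: a sketch using the pytorch-fast-transformers builder API with the attention_type named above. The layer sizes are illustrative, and the clusters parameter name is an assumption based on the library's documentation.

```python
# Sketch: a transformer encoder with improved clustered attention
# (pip install pytorch-fast-transformers).
from fast_transformers.builders import TransformerEncoderBuilder

encoder = TransformerEncoderBuilder.from_kwargs(
    attention_type="improved-clustered",  # from the post above
    n_layers=4,
    n_heads=8,
    query_dimensions=64,
    value_dimensions=64,
    clusters=100,  # number of query clusters (assumed parameter name)
).get()
```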
"One paper accepted at @NeurIPSConf with @apoorv2904 and @angeloskath on speeding up attention by clustering the queries. The nice thing is that this can be used for inference with standard pre-trained models. @Idiap_ch @unige_en @EPFL_en @snsf_ch https://arxiv.org/abs/2007.04825 https://arxiv.org/abs/2007.04825"
[X Link](https://x.com/anyuser/status/1309776077917761536) 2020-09-26T08:45Z 48.7K followers, [---] engagements
"How about your personal chat GPT on your M2 Ultra Amazing model by Mistral AI and [--] day to implement it in MLX. Mixtral 8x7B in MLX https://t.co/wh4PmlQltK Runs on an M2 Ultra ๐ข๐ข https://t.co/AFeEKyvmUu Mixtral 8x7B in MLX https://t.co/wh4PmlQltK Runs on an M2 Ultra ๐ข๐ข https://t.co/AFeEKyvmUu"
[X Link](https://x.com/anyuser/status/1734653464276602884) 2023-12-12T19:16Z [----] followers, [----] engagements
"Mixtral 8x7B in MLX Runs on an M2 Ultra ๐ข๐ข https://github.com/ml-explore/mlx-examples/tree/main/mixtral https://github.com/ml-explore/mlx-examples/tree/main/mixtral"
[X Link](https://x.com/anyuser/status/1734646740144345487) 2023-12-12T18:49Z 37.8K followers, 240.4K engagements
"I assembled the @NeurIPSConf [----] accepted papers in a list that is easy to filter by author name affiliation and paper title. Which company do you think has the most first author papers https://angeloskath.github.io/neurips-2020-accepted-papers.html https://angeloskath.github.io/neurips-2020-accepted-papers.html"
[X Link](https://x.com/anyuser/status/1313621960732049409) 2020-10-06T23:27Z [----] followers, [--] engagements
"@GoogleAI @Pablogomez3 For the "few" of us that don't use JAX yet you can now experiment with FAVOR+ (and other Fourier features) in @PyTorch using our fast-transformers library with just [--] lines of code. Code: Docs: http://fast-transformers.github.io/feature_maps/ http://github.com/idiap/fast-transformers http://fast-transformers.github.io/feature_maps/ http://github.com/idiap/fast-transformers"
[X Link](https://x.com/anyuser/status/1319757441031168001) 2020-10-23T21:47Z [----] followers, [--] engagements
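A sketch of the FAVOR+ usage the post refers to, following the library's feature-map documentation (http://fast-transformers.github.io/feature_maps/); all sizes are illustrative.

```python
# Sketch: linear attention with the FAVOR+ feature map in fast-transformers.
from fast_transformers.builders import TransformerEncoderBuilder
from fast_transformers.feature_maps import Favor

encoder = TransformerEncoderBuilder.from_kwargs(
    attention_type="linear",
    n_layers=4,
    n_heads=8,
    query_dimensions=64,
    value_dimensions=64,
    feature_map=Favor.factory(n_dims=256),  # swap in other feature maps here
).get()
```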
"For the native Greek speakers you can already interact with Meltemi on your laptop directly from HF using MLX. I also uploaded a quantized 4-bit version on mlx-community for faster inference. Almost [--] tokens per second on a MacBook Air and [--] on an M2 Ultra https://t.co/5QagPC5fjx https://t.co/5QagPC5fjx"
[X Link](https://x.com/anyuser/status/1772754914680471688) 2024-03-26T22:38Z [----] followers, [----] engagements
"https://medium.com/institute-for-language-and-speech-processing/meltemi-a-large-language-model-for-greek-9f5ef1d4a10f https://medium.com/institute-for-language-and-speech-processing/meltemi-a-large-language-model-for-greek-9f5ef1d4a10f"
[X Link](https://x.com/anyuser/status/1772660707122728965) 2024-03-26T16:23Z [---] followers, 10.9K engagements
"Kudos to David Koski From discussion to merged PR in [--] days. You can't beat open source We heard developer feedback on Monday that the MLX Swift LLM API is hard to get started with. So we went to work. We made a new and improved streamlined API. Now [--] lines toload an LLM or VLM and start a chat session in a Swift project. Happy last day of #WWDC25 https://t.co/NA00RJqjVR We heard developer feedback on Monday that the MLX Swift LLM API is hard to get started with. So we went to work. We made a new and improved streamlined API. Now [--] lines toload an LLM or VLM and start a chat session in a"
[X Link](https://x.com/anyuser/status/1933595157708091849) 2025-06-13T18:39Z [----] followers, [----] engagements
"We heard developer feedback on Monday that the MLX Swift LLM API is hard to get started with. So we went to work. We made a new and improved streamlined API. Now [--] lines toload an LLM or VLM and start a chat session in a Swift project. Happy last day of #WWDC25"
[X Link](https://x.com/anyuser/status/1933570521104593127) 2025-06-13T17:01Z 37.8K followers, 49.9K engagements
"My favorite new addition to MLX in v0.14 is the option to just-in-time compile the kernels in order to create a small binary to ease deployment More than 10x reduction to the Metal library size. https://ml-explore.github.io/mlx/build/html/install.html#binary-size-minimization pip install -U mlx https://t.co/CJYcIRJr9J https://ml-explore.github.io/mlx/build/html/install.html#binary-size-minimization pip install -U mlx https://t.co/CJYcIRJr9J"
[X Link](https://x.com/anyuser/status/1794062376796623122) 2024-05-24T17:46Z [----] followers, [----] engagements
"pip install -U mlx"
[X Link](https://x.com/anyuser/status/1793835604670902620) 2024-05-24T02:45Z 37.8K followers, 17.9K engagements
"Now merged with a small README to help people get started. Enjoy Clear your weekend for some FLUX LoRA fine-tuning in MLX thanks to @angeloskath https://t.co/PeM8vnHoZt Clear your weekend for some FLUX LoRA fine-tuning in MLX thanks to @angeloskath https://t.co/PeM8vnHoZt"
[X Link](https://x.com/anyuser/status/1844957614402396373) 2024-10-12T04:25Z [----] followers, [----] engagements
"Clear your weekend for some FLUX LoRA fine-tuning in MLX thanks to @angeloskath https://github.com/ml-explore/mlx-examples/pull/1028 https://github.com/ml-explore/mlx-examples/pull/1028"
[X Link](https://x.com/anyuser/status/1844572406381474205) 2024-10-11T02:55Z 37.8K followers, 19.9K engagements
"I feel very lucky to have been at Idiap it is a great place to pursue a PhD. I would also like to thank @francoisfleuret. I couldn't have asked for a better PhD advisor Idiaper wins @EPFL's EEDE Thesis Award ๐ Former #PhD from our institute @angeloskath has received EPFL's Electrical Engineering Doctoral program (#EEDE) Thesis Award for his outstanding research on the efficiency of #DeepLearning models. https://t.co/LO8wgKtc4U Idiaper wins @EPFL's EEDE Thesis Award ๐ Former #PhD from our institute @angeloskath has received EPFL's Electrical Engineering Doctoral program (#EEDE) Thesis Award"
[X Link](https://x.com/anyuser/status/1595851767194873856) 2022-11-24T18:48Z [----] followers, [--] engagements
"Idiaper wins @EPFL's EEDE Thesis Award ๐ Former #PhD from our institute @angeloskath has received EPFL's Electrical Engineering Doctoral program (#EEDE) Thesis Award for his outstanding research on the efficiency of #DeepLearning models. https://www.idiap.ch/en/allnews/idiaper-wins-epfls-eede-thesis-award https://www.idiap.ch/en/allnews/idiaper-wins-epfls-eede-thesis-award"
[X Link](https://x.com/anyuser/status/1595787376156106752) 2022-11-24T14:32Z [----] followers, [--] engagements
"It is quite easy to underestimate how hard that is to do in any other framework Lazy evaluation in MLX is a very cool feature but takes some getting used to. Here's one example where it is useful: for low RAM devices lazy loading/computation makes it almost trivial to stream a model from disk without ever materializing the full thing. https://t.co/nodT6gLYza Lazy evaluation in MLX is a very cool feature but takes some getting used to. Here's one example where it is useful: for low RAM devices lazy loading/computation makes it almost trivial to stream a model from disk without ever"
[X Link](https://x.com/anyuser/status/1831808320270999990) 2024-09-05T21:35Z [----] followers, [----] engagements
"Lazy evaluation in MLX is a very cool feature but takes some getting used to. Here's one example where it is useful: for low RAM devices lazy loading/computation makes it almost trivial to stream a model from disk without ever materializing the full thing"
[X Link](https://x.com/anyuser/status/1831806302722650576) 2024-09-05T21:27Z 37.8K followers, [----] engagements
"Thank you Yannic for the amazing video. The topic modeling intuition is a very interesting way to think about it and I hadn't thought of the kernels this way. Anybody that doesn't follow Yannic is seriously missing out Check out his channel https://www.youtube.com/c/YannicKilcher New Video ๐ฅ No more O(N2) complexity in Transformers: Kernels to the rescue ๐ฅณ This paper makes Attention linear AND shows an intriguing connection between Transformers and RNNs ๐ช https://t.co/jtFECITlpD @angeloskath @apoorv2904 @nik0spapp @francoisfleuret @EPFL_en @Idiap_ch https://t.co/QDxwkNM3jU"
[X Link](https://x.com/anyuser/status/1279408801549033472) 2020-07-04T13:36Z [----] followers, [--] engagements
"New Video ๐ฅ No more O(N2) complexity in Transformers: Kernels to the rescue ๐ฅณ This paper makes Attention linear AND shows an intriguing connection between Transformers and RNNs ๐ช @angeloskath @apoorv2904 @nik0spapp @francoisfleuret @EPFL_en @Idiap_ch https://youtu.be/hAooAOFRsYc https://youtu.be/hAooAOFRsYc"
[X Link](https://x.com/anyuser/status/1279396278590455809) 2020-07-04T12:46Z 84.2K followers, [---] engagements
"This is too cool. Now let's combine it with a TTS model and have it tell us nice stories while looking at the beautiful lake. Apple MLX on Vision Pro YES YOU CAN BOOM Here the raw video of MLX Swift LLMEval example running natively on the device Thanks @awnihannun ๐ ๐ฅ๐ฅ๐ฅ #VisionPro #LLM #Apple https://t.co/35T960KopQ Apple MLX on Vision Pro YES YOU CAN BOOM Here the raw video of MLX Swift LLMEval example running natively on the device Thanks @awnihannun ๐ ๐ฅ๐ฅ๐ฅ #VisionPro #LLM #Apple https://t.co/35T960KopQ"
[X Link](https://x.com/anyuser/status/1770516113610424451) 2024-03-20T18:21Z [----] followers, [----] engagements
"Apple MLX on Vision Pro YES YOU CAN BOOM Here the raw video of MLX Swift LLMEval example running natively on the device Thanks @awnihannun ๐ ๐ฅ๐ฅ๐ฅ #VisionPro #LLM #Apple"
[X Link](https://x.com/anyuser/status/1770509332817334497) 2024-03-20T17:54Z 20.4K followers, 40.2K engagements
"What a game #PameStefane #RolandGarros #Tsitsipas"
[X Link](https://x.com/anyuser/status/1404094278217969666) 2021-06-13T15:12Z [----] followers, [--] engagements
"Because you haven't really released code until you release the documentation. I just finished the first version of docs for our ICML2019 paper You can find it at Oh also you can just pip install attention-sampling . https://x.com/francoisfleuret/status/1126813878812323841 http://attention-sampling.com/ And here it is on @arxiv https://t.co/YK9y6YxWBT TL;DR: A network computes an attention map on a downscaled image and another processes locations sampled according to that map. The pair can be trained end-to-end. https://x.com/francoisfleuret/status/1126813878812323841"
[X Link](https://x.com/anyuser/status/1153402206000951296) 2019-07-22T20:31Z [----] followers, [--] engagements
"And here it is on @arxiv TL;DR: A network computes an attention map on a downscaled image and another processes locations sampled according to that map. The pair can be trained end-to-end. https://arxiv.org/abs/1905.03711 https://arxiv.org/abs/1905.03711"
[X Link](https://x.com/anyuser/status/1126813878812323841) 2019-05-10T11:38Z 48.7K followers, [---] engagements
"@DiganiJagrit has made awesome building blocks to write matmul kernels in MLX. It would be so much more work to write this otherwise and it would also be slower. Latest mlx + mlx-lm have much faster prompt processing speeds for MoEs. Thanks to some magic from @angeloskath pip install -U mlx-lm Deep Seek v3 up to 2x faster Mixtral up to 3.5x faster Llama [--] up to 2x faster Latest mlx + mlx-lm have much faster prompt processing speeds for MoEs. Thanks to some magic from @angeloskath pip install -U mlx-lm Deep Seek v3 up to 2x faster Mixtral up to 3.5x faster Llama [--] up to 2x faster"
[X Link](https://x.com/anyuser/status/1913341494792241245) 2025-04-18T21:18Z [----] followers, [----] engagements
"Latest mlx + mlx-lm have much faster prompt processing speeds for MoEs. Thanks to some magic from @angeloskath pip install -U mlx-lm Deep Seek v3 up to 2x faster Mixtral up to 3.5x faster Llama [--] up to 2x faster"
[X Link](https://x.com/anyuser/status/1913323925301502164) 2025-04-18T20:09Z 37.8K followers, 17.3K engagements
"Love this I think the barrier to entry for playing with LLMs doesnt get any lower Made a minimal but fast example of text generation with Llama [---] in MLX. Minimal: [--] file [---] lines of simple code [--] dependencies Fast: 100+ toks/sec with 4-bit 8B on M2 Ultra Code: https://t.co/aCfcEG7geA https://t.co/mw9PhLilBn Made a minimal but fast example of text generation with Llama [---] in MLX. Minimal: [--] file [---] lines of simple code [--] dependencies Fast: 100+ toks/sec with 4-bit 8B on M2 Ultra Code: https://t.co/aCfcEG7geA https://t.co/mw9PhLilBn"
[X Link](https://x.com/anyuser/status/1821980673516962152) 2024-08-09T18:43Z [----] followers, [----] engagements
"Made a minimal but fast example of text generation with Llama [---] in MLX. Minimal: [--] file [---] lines of simple code [--] dependencies Fast: 100+ toks/sec with 4-bit 8B on M2 Ultra Code: https://gist.github.com/awni/cf42588b8c084c3d93d7373b604c7f9c https://gist.github.com/awni/cf42588b8c084c3d93d7373b604c7f9c"
[X Link](https://x.com/anyuser/status/1821921920054624710) 2024-08-09T14:50Z 37.8K followers, 13.1K engagements
"I assembled the @icmlconf [----] accepted papers in a list that is easy to filter based for instance on affiliations or title. First authors from: Google [--] Microsoft [--] Facebook [--] Amazon [--] Apple [--] @EPFL_en [--] @ETH [--] #ICML2019 http://idiap.ch/katharas/pages/accepted-papers-at-icml-2019.html http://idiap.ch/katharas/pages/accepted-papers-at-icml-2019.html"
[X Link](https://x.com/anyuser/status/1127206321462415360) 2019-05-11T13:38Z [----] followers, [--] engagements
"@unixpickle @awnihannun Unified memory is the big one. The fast Metal kernels and linking to accelerate or Apple specific SIMD instructions would be another one. We are very excited to explore what new architecture the above will enable or the impact to the existing ones"
[X Link](https://x.com/anyuser/status/1732232136269418930) 2023-12-06T02:54Z [----] followers, [----] engagements
"ICCV reviewer invitation expires 2/1/2021 . now does that mean I missed it or that when addressing an international crowd the US date notation is very confusing"
[X Link](https://x.com/anyuser/status/1348757304259219457) 2021-01-11T22:22Z [----] followers, [--] engagements
"I know which model I am uploading to MLX community today ๐ https://t.co/5QagPC5fjx https://t.co/5QagPC5fjx"
[X Link](https://x.com/anyuser/status/1772679052849000714) 2024-03-26T17:36Z [----] followers, [----] engagements
"Did you know that clustered attention approximates a pretrained wav2vec on librispeech two times better than Performer's FAVOR Come talk to us at our #NeurIPS2020 poster in [--] hours to find out more With @angeloskath and @francoisfleuret we will present our work on fast transformers with clustering at #NeurIPS2020 on Thu @ 18:00 CET. Please visit our poster to know more. We will also answer questions on chat. Poster: https://t.co/o4eaP8n69P Project: https://t.co/QTekQFBq0k With @angeloskath and @francoisfleuret we will present our work on fast transformers with clustering at #NeurIPS2020 on"
[X Link](https://x.com/anyuser/status/1337049146952462337) 2020-12-10T14:58Z [----] followers, [--] engagements
"With @angeloskath and @francoisfleuret we will present our work on fast transformers with clustering at #NeurIPS2020 on Thu @ 18:00 CET. Please visit our poster to know more. We will also answer questions on chat. Poster: Project: http://clustered-transformers.github.io/ http://neurips.cc/virtual/2020/protected/poster_f6a8dd1c954c8506aadc764cc32b895e.html http://clustered-transformers.github.io/ http://neurips.cc/virtual/2020/protected/poster_f6a8dd1c954c8506aadc764cc32b895e.html"
[X Link](https://x.com/anyuser/status/1336430822086369282) 2020-12-08T22:01Z [---] followers, [--] engagements
"An interesting take-away from this work is that sequence-level routing works in real world tasks like question answering by routing based on a short prefix. This enables practically communication-free training of a traditional mixture of experts model and cheap inference. Dont have fast interconnect No problem No need for nodes to talk We introduce SMALLTALK LM an innovative method for training a mixture of language models (almost) asynchronously. SMALLTALK LM achieves better perplexity and better accuracy on a majority of downstream tasks https://t.co/ZWn0eQxzEe Dont have fast interconnect"
[X Link](https://x.com/anyuser/status/1844626390353838452) 2024-10-11T06:29Z [----] followers, [----] engagements
"Dont have fast interconnect No problem No need for nodes to talk We introduce SMALLTALK LM an innovative method for training a mixture of language models (almost) asynchronously. SMALLTALK LM achieves better perplexity and better accuracy on a majority of downstream tasks compared to a regular dense language model trained with the same amount of FLOPs. Paper: A joint work with amazing @angeloskath @GrangierDavid and Ronan Collobert. Let me explain how we did it -๐งต: 1/ https://arxiv.org/pdf/2410.03529 https://arxiv.org/pdf/2410.03529"
[X Link](https://x.com/anyuser/status/1844567591240511636) 2024-10-11T02:35Z [---] followers, [----] engagements
"I wish CMT had a negative tweet limit. Basically if your review fits in a tweet you shouldn't be able to submit it. #cvpr2020 #cvpr"
[X Link](https://x.com/anyuser/status/1223687481373200385) 2020-02-01T19:20Z [----] followers, [--] engagements
"To reproduce the video above first pip install -U mlx_lm and then python -m mlx_lm.generate --model mlx-community/ilsp-Meltemi-7B-Instruct-v1-4bit --prompt " ." --temp [---] --max-tokens [----] on any M-series Mac"
[X Link](https://x.com/anyuser/status/1772754916672786561) 2024-03-26T22:38Z [----] followers, [---] engagements
"Awesome work by a friend in @Oxford_VGG Watch people fighting on TV (we all like that right) without missing a single thing anybody says. Related publications: http://www.robots.ox.ac.uk/vgg/publications/2018/Afouras18b/afouras18b.pdf http://www.robots.ox.ac.uk/vgg/publications/2018/Afouras18/afouras18.pdf Can #AI modelling help people with hearing difficulties Discover how #OxfordAI could assist those with hearing difficulties by isolating voices in noisy environments: https://t.co/AmrE7QIoqw https://t.co/nzJAEuzKf9 http://www.robots.ox.ac.uk/vgg/publications/2018/Afouras18b/afouras18b.pdf"
[X Link](https://x.com/anyuser/status/1059132418156830720) 2018-11-04T17:17Z [----] followers, [--] engagements
"Can #AI modelling help people with hearing difficulties Discover how #OxfordAI could assist those with hearing difficulties by isolating voices in noisy environments: http://po.st/OxfordAI http://po.st/OxfordAI"
[X Link](https://x.com/anyuser/status/1058632622509703168) 2018-11-03T08:11Z 1M followers, [---] engagements
"What started in May is finalized in Greece's national elections yesterday. The far-right neo-fascist party did not make it in the greek parliament Hopefully the rest of Europe will follow. #ekloges19 #greekelections2019 #Europe https://x.com/angeloskath/status/1133354726530146306s=19 The definition of mixed feelings: When the far-right party of your country loses half their votes in [--] years and at the same time they will have [--] representatives in the european parliament because 4.9% is still too much. #EuropeanElectionResults #EUelections2019"
[X Link](https://x.com/anyuser/status/1148187590526459907) 2019-07-08T11:10Z [----] followers, [--] engagements
"The definition of mixed feelings: When the far-right party of your country loses half their votes in [--] years and at the same time they will have [--] representatives in the european parliament because 4.9% is still too much. #EuropeanElectionResults #EUelections2019"
[X Link](https://x.com/anyuser/status/1133354726530146306) 2019-05-28T12:49Z [----] followers, [--] engagements
"@francoisfleuret But we established that it cannot do this though"
[X Link](https://x.com/anyuser/status/1239476419652026368) 2020-03-16T08:59Z [----] followers, [--] engagements
"@Prince_Canuma There is but they need a bit of a refresh. I will see if it makes sense to make a small launch helper script that will prevent a bunch of "foot-guns" related to launching MPI jobs. https://ml-explore.github.io/mlx/build/html/usage/distributed.html https://ml-explore.github.io/mlx/build/html/usage/distributed.html"
[X Link](https://x.com/anyuser/status/1852122737361432795) 2024-10-31T22:57Z [----] followers, [---] engagements
"When we finished developing "Transformers are RNNs" we had planned to showcase it using music generation. We ended up not investing the necessary time but today I came across "Compound Word Transformer" and I love the generated music. Check it out https://ailabs.tw/human-interaction/compound-word-transformer-generate-pop-piano-music-of-full-song-length/ https://ailabs.tw/human-interaction/compound-word-transformer-generate-pop-piano-music-of-full-song-length/"
[X Link](https://x.com/anyuser/status/1348995164229005313) 2021-01-12T14:08Z [----] followers, [--] engagements
"This is very cool It has also been a few times I have wanted to shoutout at . @zcbenz has done an awesome job with the MLX bindings to Node.js. They are quite enjoyable to read with the proper amount of JS specific add-ons or quirks. https://github.com/frost-beta/node-mlx I wrote a CLI tool for semantic images search using Node.js and MLX. No third party API is used everything runs locally. Index is built of image embeddings with CLIP model and searching is just computing cosine similarities. https://t.co/QNuwkKX1x6 https://t.co/YNNgmetG9k https://github.com/frost-beta/node-mlx I wrote a CLI"
[X Link](https://x.com/anyuser/status/1835722216673300773) 2024-09-16T16:47Z [----] followers, [---] engagements
"I wrote a CLI tool for semantic images search using Node.js and MLX. No third party API is used everything runs locally. Index is built of image embeddings with CLIP model and searching is just computing cosine similarities. https://github.com/frost-beta/sisi https://github.com/frost-beta/sisi"
[X Link](https://x.com/anyuser/status/1835631064544620751) 2024-09-16T10:45Z [----] followers, [----] engagements
"@ruairiSpain This is all coming from the fused attention kernel which we just added. It may even have some space for further optimization in the next few days"
[X Link](https://x.com/anyuser/status/1847381368445288593) 2024-10-18T20:56Z [----] followers, [---] engagements
"Switzerland is not closing schools for #COVID19 because it would endanger grandparents who would take care of the children. Greece on the other hand pays for the vacation days of one of the two parents and closes all schools for [--] days. Switzerland man-up"
[X Link](https://x.com/anyuser/status/1237793499942195204) 2020-03-11T17:32Z [----] followers, [--] engagements
"Arxiv and code coming soon. One paper accepted at #ICML2019 with @angeloskath on attention-sampling with deep architectures to process megapixel images. One paper accepted at #ICML2019 with @angeloskath on attention-sampling with deep architectures to process megapixel images"
[X Link](https://x.com/anyuser/status/1120325504660320257) 2019-04-22T13:56Z [----] followers, [--] engagements
"One paper accepted at #ICML2019 with @angeloskath on attention-sampling with deep architectures to process megapixel images"
[X Link](https://x.com/anyuser/status/1120303069236006912) 2019-04-22T12:27Z 48.7K followers, [--] engagements
"@CVPRConf website is down but @paschalidoud_1 is also analog #CVPR2020"
[X Link](https://x.com/anyuser/status/1272941178409844740) 2020-06-16T17:16Z [----] followers, [--] engagements
"It is rare to see such a thorough and complete analysis nowadays. [--] awesome pages of appendix Any comparison/ablation you didn't know you wanted is probably there already. It was a privilege to see it being developed by Matteo David and Pierre. Stop discarding your old gradients Introducing AdEMAMix a novel (first-order) optimizer capable of outperforming Adam. Lets have a thread on momentum and the surprising relevance of very old gradients. A joint work with @GrangierDavid and @PierreAblin #ml #optimization 1/๐งต https://t.co/MbGVcSIPdg Stop discarding your old gradients Introducing"
[X Link](https://x.com/anyuser/status/1832179810124362118) 2024-09-06T22:11Z [----] followers, [----] engagements
"Stop discarding your old gradients Introducing AdEMAMix a novel (first-order) optimizer capable of outperforming Adam. Lets have a thread on momentum and the surprising relevance of very old gradients. A joint work with @GrangierDavid and @PierreAblin #ml #optimization 1/๐งต"
[X Link](https://x.com/anyuser/status/1832107984752951478) 2024-09-06T17:25Z [---] followers, 67.1K engagements
"Effort led by @awnihannun and I have to say that readability and development did not suffer at all to enable this new capability. If anything kernel instantiation (using the preprocessor) and kernel definition are now nicely separated"
[X Link](https://x.com/anyuser/status/1794062378252005458) 2024-05-24T17:46Z [----] followers, [---] engagements
"@lucidrains @apoorv2904 @SmallerNNsPls @francoisfleuret @trees_random @icmlconf @nik0spapp @Idiap_ch @EPFL Thanks for the interest Indeed. However the main benefit of our work is the derivation of a formulation that allows to write an autoregressive transformer as an RNN; thus resulting in orders of magnitude speed up during inference. (we really need to speed up the preprint :-))"
[X Link](https://x.com/anyuser/status/1267939850717667328) 2020-06-02T22:03Z [----] followers, [--] engagements
"Congrats to all the researchers from ILSP and Athena research center that worked on this I couldn't find twitter handles to tag people so please let me know if I should be tagging someone"
[X Link](https://x.com/anyuser/status/1772754917968773405) 2024-03-26T22:38Z [----] followers, [---] engagements
"@caviterginsoy @ivanfioravanti @alex_barron1 @DiganiJagrit Neglected a bit for sure but what about ๐ https://github.com/ml-explore/mlx-data https://github.com/ml-explore/mlx-data"
[X Link](https://x.com/anyuser/status/1860076969700036998) 2024-11-22T21:44Z [----] followers, [--] engagements
"@SmallerNNsPls @francoisfleuret @trees_random @icmlconf @apoorv2904 @nik0spapp @Idiap_ch @EPFL Yes they are normalized as follows (Q) (K)' V / (sum_i (Q) (K)_i). You have to assume some broadcasting semantics in the above equation due to twitter"
[X Link](https://x.com/anyuser/status/1267592538791215104) 2020-06-01T23:03Z [----] followers, [--] engagements
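With the feature map φ restored (the scraper stripped the symbol), the normalization above is the linear-attention formula from the "Transformers are RNNs" line of work. A toy numpy sketch, assuming the elu(x)+1 feature map used in that paper:

```python
# Toy sketch of the normalization above: phi(Q) (phi(K)^T V) divided by
# phi(Q) sum_i phi(K)_i, with the elu(x)+1 feature map from the paper.
import numpy as np

def phi(x):
    return np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1, positive features

def linear_attention(Q, K, V):
    Qp, Kp = phi(Q), phi(K)             # (N, D) feature-mapped queries/keys
    S = Kp.T @ V                        # (D, Dv): sum_i phi(k_i) v_i^T
    Z = Kp.sum(axis=0)                  # (D,):   sum_i phi(k_i)
    return (Qp @ S) / (Qp @ Z)[:, None]

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((8, 4)) for _ in range(3))
print(linear_attention(Q, K, V).shape)  # (8, 4)
```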
"@francoisfleuret Easy. Woodworker or blacksmith or both. Making tools to make tools to make tools is still one of the big joys of life"
[X Link](https://x.com/anyuser/status/1246145306808520705) 2020-04-03T18:39Z [----] followers, [--] engagements
"So. @github you implement code search but decide to ignore . : ; / ' " = * # $ & + ( ) I am having fun searching for function definitions/implementations without being able to use "func(" or "::func""
[X Link](https://x.com/anyuser/status/1159089129390366721) 2019-08-07T13:09Z [----] followers, [--] engagements
"@ykilcher @_florianmai @jiangelaa @zacharylipton @francoisfleuret If you are looking for an intuitive explanation regarding why these methods don't help much on hard datasets (the question raised in the video) they rely on the existence of uninformative datapoints. In Imagenet there are none for most of the training"
[X Link](https://x.com/anyuser/status/1181243450253090818) 2019-10-07T16:22Z [----] followers, [--] engagements
"Oh and the model definition looks even more familiar"
[X Link](https://x.com/anyuser/status/1764876372189859872) 2024-03-05T04:51Z [----] followers, [---] engagements
"@demirbasayyuce @awnihannun Well actually I dont think you need any of that due to unified memory. Quantizing the Lora example in mlx should work out of the box. Havent tried it yet but I dont see why not"
[X Link](https://x.com/anyuser/status/1738008676139745355) 2023-12-22T01:28Z [----] followers, [---] engagements
"Usually I adore @PyTorch software engineering but going from v1.5.0 to v1.6.0 breaks at::detail::getDefaultCPUGenerator() which breaks some C++ extensions. Shouldn't that be in the release notes"
[X Link](https://x.com/anyuser/status/1298672752727973890) 2020-08-26T17:24Z [----] followers, [--] engagements
"@unixpickle Our convs (especially the backward) do need some love. We are working on it and they will be faster soon ๐"
[X Link](https://x.com/anyuser/status/1859751519643632052) 2024-11-22T00:11Z [----] followers, [---] engagements
"@Prince_Canuma Here is the PR It's been there months ๐
we 'll merge it shortly. TL;DR: The mlx_lm.lora supports distributed finetuning. All you have to do is launch it with mpirun . https://github.com/ml-explore/mlx-examples/pull/821 https://github.com/ml-explore/mlx-examples/pull/821"
[X Link](https://x.com/anyuser/status/1852128822436729241) 2024-10-31T23:21Z [----] followers, [---] engagements
"@unixpickle Indeed but isnt this like 90% of fine-tuning runs Ymmv but for a 14B even 200M parameters LoRA scales very reasonably via Ethernet. Something like 5x speed up for [--] nodes"
[X Link](https://x.com/anyuser/status/1852211131307237426) 2024-11-01T04:48Z [----] followers, [--] engagements
"@Prince_Canuma Each machine has the full model ie this is simple data parallel training. Given that the Macs have comparatively very high memory capacity per FLOP ratio I think it is also the best strategy. Combined with grad accumulation and checkpointing you can fine-tune almost anything"
[X Link](https://x.com/anyuser/status/1852132990702490002) 2024-10-31T23:38Z [----] followers, [---] engagements
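A sketch of the data-parallel recipe described above (full model on every node, gradients averaged across nodes), using MLX's documented distributed API; this is illustrative, not the mlx_lm.lora implementation:

```python
# Sketch of data-parallel training as described above: every node holds the
# full model, computes gradients on its own shard of the batch, and the
# gradients are averaged with an all-sum. Illustrative, not mlx_lm.lora code.
import mlx.core as mx
from mlx.utils import tree_map

world = mx.distributed.init()  # process group, e.g. when launched via mpirun

def average_gradients(grads):
    n = world.size()
    # all_sum adds each gradient array across nodes; divide for the mean.
    return tree_map(lambda g: mx.distributed.all_sum(g) / n, grads)
```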
"At these speeds who needs RAG am I right In the latest MLX generating with long prompts is much faster with KV quantization. Thanks to @alex_barron1. 4-bit Llama 8B with a [-----] token prompt + 8-bit KV cache generates at [--] toks/sec on an M2 Ultra: https://t.co/TlBZTOhpfD In the latest MLX generating with long prompts is much faster with KV quantization. Thanks to @alex_barron1. 4-bit Llama 8B with a [-----] token prompt + 8-bit KV cache generates at [--] toks/sec on an M2 Ultra: https://t.co/TlBZTOhpfD"
[X Link](https://x.com/anyuser/status/1854284083662324148) 2024-11-06T22:05Z [----] followers, [---] engagements
"In the latest MLX generating with long prompts is much faster with KV quantization. Thanks to @alex_barron1. 4-bit Llama 8B with a [-----] token prompt + 8-bit KV cache generates at [--] toks/sec on an M2 Ultra:"
[X Link](https://x.com/anyuser/status/1854195663560679613) 2024-11-06T16:14Z 37.8K followers, [----] engagements
"@CSProfKGD @CVPR This is great Could not think of anybody better for this :-)"
[X Link](https://x.com/anyuser/status/1381534041955053575) 2021-04-12T09:05Z [----] followers, [--] engagements
"@KassinosS @awnihannun Out of curiosity how would a simple relu MLP that passes the inputs through a simple sinusoidal positional encoding do in that problem In my experience they are a pretty good baseline for any such function approximation. See for examples of what I mean. https://bmild.github.io/fourfeat/ https://bmild.github.io/fourfeat/"
[X Link](https://x.com/anyuser/status/1749738202511089668) 2024-01-23T10:17Z [----] followers, [---] engagements
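A minimal sketch of the suggested baseline, following the Fourier-features recipe from the linked fourfeat page; the sizes and frequency scale below are arbitrary illustrative choices, not values from the post:

```python
# Sketch of the suggested baseline: random sinusoidal (Fourier) features
# followed by a small ReLU MLP, per the linked fourfeat recipe.
import math
import mlx.core as mx
import mlx.nn as nn

class FourierMLP(nn.Module):
    def __init__(self, in_dim=2, n_freqs=64, hidden=128, out_dim=1, scale=10.0):
        super().__init__()
        # Random projection matrix; kept fixed in the fourfeat recipe.
        self.B = mx.random.normal((in_dim, n_freqs)) * scale
        self.l1 = nn.Linear(2 * n_freqs, hidden)
        self.l2 = nn.Linear(hidden, out_dim)

    def __call__(self, x):
        proj = 2 * math.pi * (x @ self.B)
        feats = mx.concatenate([mx.sin(proj), mx.cos(proj)], axis=-1)
        return self.l2(nn.relu(self.l1(feats)))

y = FourierMLP()(mx.random.uniform(shape=(16, 2)))  # -> shape (16, 1)
```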
"Yay Our awesome group is growing I have two open phd positions in my group at @Idiap_ch / @EPFL_en Both in deep learning one in computer vision to combine multi-sensors for scene reconstruction and the other for weather forecast and air traffic control. https://t.co/wtRcjpc4Vl I have two open phd positions in my group at @Idiap_ch / @EPFL_en Both in deep learning one in computer vision to combine multi-sensors for scene reconstruction and the other for weather forecast and air traffic control. https://t.co/wtRcjpc4Vl"
[X Link](https://x.com/anyuser/status/1152466320769933312) 2019-07-20T06:32Z [----] followers, [--] engagements
"I have two open phd positions in my group at @Idiap_ch / @EPFL_en Both in deep learning one in computer vision to combine multi-sensors for scene reconstruction and the other for weather forecast and air traffic control. https://www.idiap.ch/fleuret/hiring.html https://www.idiap.ch/fleuret/hiring.html"
[X Link](https://x.com/anyuser/status/1152249843898880001) 2019-07-19T16:12Z 48.7K followers, [---] engagements
"@WankyuChoi I am super happy you picked it up ๐. I actually added it to the example after seeing your previous demo and comments. Great video as always"
[X Link](https://x.com/anyuser/status/1741224588603007004) 2023-12-30T22:27Z [----] followers, [---] engagements
"@francoisfleuret Advice for people looking for a career: learn software engineering"
[X Link](https://x.com/anyuser/status/1235108805089914880) 2020-03-04T07:44Z [----] followers, [--] engagements
"@ivanfioravanti @emrekoctw @awnihannun The UNet and text encoders should be fine as they only need about 4GB when quantized. The decoder otoh needs more. The trick there is to apply the decoder in a tiling fashion but I am not 100% sure it will be straightforward"
[X Link](https://x.com/anyuser/status/1766857538270880051) 2024-03-10T16:03Z [----] followers, [---] engagements
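A sketch of the tiling trick mentioned above: run the decoder over latent tiles so only one tile's activations are resident at a time. `decode` is a hypothetical stand-in for the real decoder, and the naive non-overlapping tiles here can leave seams, which is why the post hedges on it being straightforward:

```python
# Sketch of the tiling idea: decode the latents tile by tile so only one
# tile's activations are alive at a time. `decode` is a hypothetical stand-in
# for the real decoder; naive non-overlapping tiles can show seams at the
# borders, which is the non-trivial part in practice.
import numpy as np

def tiled_decode(latents, decode, tile=64, upscale=8):
    h, w, _ = latents.shape
    out = np.zeros((h * upscale, w * upscale, 3), dtype=np.float32)
    for i in range(0, h, tile):
        for j in range(0, w, tile):
            patch = decode(latents[i:i + tile, j:j + tile])
            out[i * upscale:i * upscale + patch.shape[0],
                j * upscale:j * upscale + patch.shape[1]] = patch
    return out
```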
"@priontific This is possible in all frameworks. The only thing that changes is simplicity and efficiency. Eg I am pretty sure that in PyTorch one would have to manually unload layers mid computation. Simply put with MLX the efficiency is great and I dont think it gets any simpler"
[X Link](https://x.com/anyuser/status/1831959316775211294) 2024-09-06T07:35Z [----] followers, [--] engagements
"@ivanfioravanti Looking at that you should probably also increase the batch size. It will help amortize the communication cost. Here each machine deals with [--] sequences I would do [--] per machine meaning [--] total"
[X Link](https://x.com/anyuser/status/1853186254944477679) 2024-11-03T21:23Z [----] followers, [--] engagements
"Fantastic work by @NasFilippova before even starting her PhD ๐"
[X Link](https://x.com/anyuser/status/1844626391612129537) 2024-10-11T06:29Z [----] followers, [---] engagements
"@dimadamen @ducha_aiki Oops sorry if it was perceived as whining mostly meant as a joke ๐. Thanks a lot for the reply and taking it into account for the future"
[X Link](https://x.com/anyuser/status/1348919550964858880) 2021-01-12T09:07Z [----] followers, [--] engagements
"@unixpickle @gazorp5 @awnihannun It would be quite an architectural change I believe to have unified memory in either of the two. It is not as simple as making a backend since the operations need to synchronize but not copy even though they may run on GPU or CPU"
[X Link](https://x.com/anyuser/status/1732245510575304747) 2023-12-06T03:48Z [----] followers, [---] engagements
"@awnihannun @priontific It is also the schnell model which is significantly harder to fine tune"
[X Link](https://x.com/anyuser/status/1845185145772585346) 2024-10-12T19:29Z [----] followers, [---] engagements
"@unixpickle @gazorp5 @awnihannun Moreover designing a backend would mean we inherit all the negative aspects of these frameworks whether they are shape based compilation or eager computation or something else"
[X Link](https://x.com/anyuser/status/1732245595698635028) 2023-12-06T03:48Z [----] followers, [---] engagements
"@ducha_aiki @francoisfleuret @apoorv2904 @nik0spapp @jb_cordonnier Well not instead of self-attention but you could look at that uses a similar mechanism with completely data independent values to replace fully connected layers. https://arxiv.org/abs/1907.05242 https://arxiv.org/abs/1907.05242"
[X Link](https://x.com/anyuser/status/1282639111837409281) 2020-07-13T11:32Z [----] followers, [--] engagements
"@lucidrains @apoorv2904 @SmallerNNsPls @francoisfleuret @trees_random @icmlconf @nik0spapp @Idiap_ch @EPFL In pseudocode yes. In practice this requires N times more memory than necessary so we opt for a custom CUDA kernel. During inference this is kept as the state so one only needs the last value of the cumsum anyway (so no custom kernels necessary)"
[X Link](https://x.com/anyuser/status/1267949404729806850) 2020-06-02T22:41Z [----] followers, [--] engagements
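The inference-time point above, sketched: the running sums are the entire attention state, so generating each token costs O(1) in sequence length. Same toy elu(x)+1 feature map as the earlier sketch:

```python
# Sketch of the inference-time view above: the running sums S and Z are the
# whole attention state, so each new token is O(1) in the sequence length.
import numpy as np

def phi(x):
    return np.where(x > 0, x + 1.0, np.exp(x))  # toy elu(x) + 1 feature map

class LinearAttentionState:
    def __init__(self, d_keys, d_values):
        self.S = np.zeros((d_keys, d_values))  # running sum_i phi(k_i) v_i^T
        self.Z = np.zeros(d_keys)              # running sum_i phi(k_i)

    def step(self, q, k, v):
        pk = phi(k)
        self.S += np.outer(pk, v)              # fold the new token into state
        self.Z += pk
        pq = phi(q)
        return (pq @ self.S) / (pq @ self.Z)   # attention output, O(1) cost
```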
"@Prince_Canuma Yeah of course you can implement it. But fyi pipeline training is significantly more complicated than pipeline inference (and harder to get linear scaling as well)"
[X Link](https://x.com/anyuser/status/1852146572789956937) 2024-11-01T00:32Z [----] followers, [--] engagements
"@andriy_mulyar @_joaogui1 @pragmaticml Besides the custom kernels I think the jax implementation of linear attention is a bit off. In theory it should be identical to performers without the feature map so *at least* as fast. In our implementation it is 2-3 times faster than FAVOR with [---] dims"
[X Link](https://x.com/anyuser/status/1326023734340292608) 2020-11-10T04:47Z [----] followers, [--] engagements
"@Suuraj @francoisfleuret @pafrossard @AlexAlahi @_beenkim @LudovicDenoyer Congratulations to both you guys Well deserved ๐ฅณ๐ฅณ๐ฅณ"
[X Link](https://x.com/anyuser/status/1472151937768464387) 2021-12-18T10:29Z [----] followers, [--] engagements
"@ivanfioravanti Thats exciting ๐. Ethernet thunderbolt or WiFi"
[X Link](https://x.com/anyuser/status/1853163042135068947) 2024-11-03T19:51Z [----] followers, [--] engagements
"@dave_andersen @ykilcher @_florianmai @jiangelaa @zacharylipton @francoisfleuret It's my bad for posting it without more context. It is the empirical variance of the mini-batch gradient under different sampling distributions. Namely we sample mini-batches compute the grad and compare the norm of the diff with the average gradient"
[X Link](https://x.com/anyuser/status/1182431144752623617) 2019-10-10T23:02Z [----] followers, [--] engagements
"@priontific @MePavelKral @awnihannun Hmm try to not generate progress images it saves some memory (it shouldn't but it does will look into it). For 512x512 batch size [--] I am getting 49GB so it should be doable without swap. Agreed though QLoRA (and/or checkpointing) is gonna make a huge difference"
[X Link](https://x.com/anyuser/status/1844927281325572454) 2024-10-12T02:25Z [----] followers, [---] engagements
"@walkfourmore You can fine tune it using LoRA on your laptop (see the MLX examples). An 8GB MacBook Air wont break any speed records but you can easily fine tune it on your data over night if they are about a book long"
[X Link](https://x.com/anyuser/status/1774190727863664886) 2024-03-30T21:43Z [----] followers, [--] engagements
"I know I probably shouldn't be using in my code but keras definitely shouldn't be using 'from tensorflow_backend import *' . http://K.tf http://K.tf"
[X Link](https://x.com/anyuser/status/1166682246469828609) 2019-08-28T12:01Z [----] followers, [--] engagements
"@MePavelKral @awnihannun Its currently around 50GB but there are many things left to do to get this down to less than [--]. More PRs coming โบ"
[X Link](https://x.com/anyuser/status/1844815397675213161) 2024-10-11T19:00Z [----] followers, [--] engagements
"@chriswolfvision @francoisfleuret Well I think broadcasting is great The problem is with implicit expand_dims. Who thought that it was a good idea to implicitly resize tensors so that the dims work Under that reasoning all element wise operations are possible by expanding enough times both tensors"
[X Link](https://x.com/anyuser/status/1346506635380989952) 2021-01-05T17:19Z [----] followers, [--] engagements
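The complaint above, made concrete with a two-line numpy example: implicit expand_dims lets an element-wise op between mismatched shapes "succeed" by silently growing both operands:

```python
# The complaint, concretely: subtracting a (3,) vector from a (3, 1) column
# "works" through implicit expand_dims and silently yields a (3, 3) matrix
# instead of raising a shape error.
import numpy as np

col = np.arange(3.0).reshape(3, 1)  # shape (3, 1)
vec = np.arange(3.0)                # shape (3,)
print((col - vec).shape)            # (3, 3): rarely what was intended
```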
"@Prince_Canuma Indeed. It is also the case for flux finetuning btw"
[X Link](https://x.com/anyuser/status/1852131124224700669) 2024-10-31T23:30Z [----] followers, [--] engagements
"Removing a public member from a python module is a backwards incompatible change and should incur a major version change. Looking at you keras.backend . that you no longer provide tf moving from v2.2.4 to v2.2.5 . @fchollet"
[X Link](https://x.com/anyuser/status/1166682244922126344) 2019-08-28T12:01Z [----] followers, [--] engagements
"@ivanfioravanti Hm feel free to drop an issue if you think that there is something to be fixed"
[X Link](https://x.com/anyuser/status/1908187050945437733) 2025-04-04T15:57Z [----] followers, [---] engagements
"@dave_andersen @ykilcher @_florianmai @jiangelaa @zacharylipton @francoisfleuret Specifically if we consider the gradient norm as an indicator on whether a sample is informative we see that for Imagenet the distribution of the norms is much closer to uniform (hence we cannot reduce the variance as depicted)"
[X Link](https://x.com/anyuser/status/1182417189623750657) 2019-10-10T22:06Z [----] followers, [--] engagements
"We implemented quantization from scratch in a week. I think that is one of the biggest strengths of MLX. Easy to use but also easy to extend and customize. We cant wait to see what people will implement in a month Big update to MLX but especially ๐ฅ N-bit quantization and quantized matmul kernels Thanks to the wizardry of @angeloskath pip install -U mlx https://t.co/YcUhRv31TP Big update to MLX but especially ๐ฅ N-bit quantization and quantized matmul kernels Thanks to the wizardry of @angeloskath pip install -U mlx https://t.co/YcUhRv31TP"
X Link 2023-12-22T05:55Z [----] followers, 27.1K engagements
"@WankyuChoi I am super happy you picked it up ๐. I actually added it to the example after seeing your previous demo and comments. Great video as always"
X Link 2023-12-30T22:27Z [----] followers, [---] engagements
"@KassinosS @awnihannun Out of curiosity how would a simple relu MLP that passes the inputs through a simple sinusoidal positional encoding do in that problem In my experience they are a pretty good baseline for any such function approximation. See for examples of what I mean. https://bmild.github.io/fourfeat/ https://bmild.github.io/fourfeat/"
X Link 2024-01-23T10:17Z [----] followers, [---] engagements
"Looking back at all the amazing things people built with MLX in a couple of months I am incredibly excited to see the things that will be built now in a familiar dev environment in Swift Just [--] lines of code to write a general multi-head attention in MLX Swift ๐๐๐ As part of our goal to make MLX a great research tool we're expanding support to new languages like Swift and C making experimentation on Apple silicon easier for ML researchers. Video generating text with Mistral 7B and MLX Swift ๐ MLX is an array framework for machine https://t.co/nKJcgqePyr As part of our goal to make MLX a"
X Link 2024-02-20T22:20Z [----] followers, 11.7K engagements
"What I find even cooler than training on an iPhone is that it is done with just [--] lines of code that are super readable and very familiar to anyone that writes training loops in python. Let's go MLX Swift ๐๐๐ Using MLX Swift to train LeNet on MNIST. Takes less than a minute on my iPhone [--]. Example here: https://t.co/lQs6mECoIK @ylecun long-live MNIST https://t.co/T5eSdcDBM3 Using MLX Swift to train LeNet on MNIST. Takes less than a minute on my iPhone [--]. Example here: https://t.co/lQs6mECoIK @ylecun long-live MNIST https://t.co/T5eSdcDBM3"
X Link 2024-03-05T04:51Z [----] followers, 19.3K engagements
"Oh and the model definition looks even more familiar"
X Link 2024-03-05T04:51Z [----] followers, [---] engagements
"This is too cool. Now let's combine it with a TTS model and have it tell us nice stories while looking at the beautiful lake. Apple MLX on Vision Pro YES YOU CAN BOOM Here the raw video of MLX Swift LLMEval example running natively on the device Thanks @awnihannun ๐ ๐ฅ๐ฅ๐ฅ #VisionPro #LLM #Apple https://t.co/35T960KopQ Apple MLX on Vision Pro YES YOU CAN BOOM Here the raw video of MLX Swift LLMEval example running natively on the device Thanks @awnihannun ๐ ๐ฅ๐ฅ๐ฅ #VisionPro #LLM #Apple https://t.co/35T960KopQ"
X Link 2024-03-20T18:21Z [----] followers, [----] engagements
"I know which model I am uploading to MLX community today ๐ https://t.co/5QagPC5fjx https://t.co/5QagPC5fjx"
X Link 2024-03-26T17:36Z [----] followers, [----] engagements
"For the native Greek speakers you can already interact with Meltemi on your laptop directly from HF using MLX. I also uploaded a quantized 4-bit version on mlx-community for faster inference. Almost [--] tokens per second on a MacBook Air and [--] on an M2 Ultra https://t.co/5QagPC5fjx https://t.co/5QagPC5fjx"
X Link 2024-03-26T22:38Z [----] followers, [----] engagements
"To reproduce the video above first pip install -U mlx_lm and then python -m mlx_lm.generate --model mlx-community/ilsp-Meltemi-7B-Instruct-v1-4bit --prompt " ." --temp [---] --max-tokens [----] on any M-series Mac"
X Link 2024-03-26T22:38Z [----] followers, [---] engagements
"Congrats to all the researchers from ILSP and Athena research center that worked on this I couldn't find twitter handles to tag people so please let me know if I should be tagging someone"
X Link 2024-03-26T22:38Z [----] followers, [---] engagements
"@walkfourmore Hmm just ran it on an 8GB M1 Mac mini (same chip) and it gets a very respectable [--] tps. Feel free to file an issue on GitHub with details to help you debug your setup. Otherwise doing a fresh install on a new python environment should probably be enough"
X Link 2024-03-29T23:32Z [----] followers, [--] engagements
"@walkfourmore https://github.com/ml-explore/mlx-examples https://github.com/ml-explore/mlx https://github.com/ml-explore/mlx-examples https://github.com/ml-explore/mlx"
X Link 2024-03-30T04:54Z [----] followers, [--] engagements
"@walkfourmore You can fine tune it using LoRA on your laptop (see the MLX examples). An 8GB MacBook Air wont break any speed records but you can easily fine tune it on your data over night if they are about a book long"
X Link 2024-03-30T21:43Z [----] followers, [--] engagements
"I have to say it because @awnihannun is quick to give credit to others but doesnt take much for himself. This performance improvement largely comes from his relentless hunting down of every kind of overhead in MLX the past weeks. Kudos MLX [----] [----] faster generation across model sizes and machines. tokens-per-second for 4-bit models: https://t.co/o5f3WmVzGZ MLX [----] [----] faster generation across model sizes and machines. tokens-per-second for 4-bit models: https://t.co/o5f3WmVzGZ"
X Link 2024-04-20T19:51Z [----] followers, 17.8K engagements
"My favorite new addition to MLX in v0.14 is the option to just-in-time compile the kernels in order to create a small binary to ease deployment More than 10x reduction to the Metal library size. https://ml-explore.github.io/mlx/build/html/install.html#binary-size-minimization pip install -U mlx https://t.co/CJYcIRJr9J https://ml-explore.github.io/mlx/build/html/install.html#binary-size-minimization pip install -U mlx https://t.co/CJYcIRJr9J"
X Link 2024-05-24T17:46Z [----] followers, [----] engagements
"Effort led by @awnihannun and I have to say that readability and development did not suffer at all to enable this new capability. If anything kernel instantiation (using the preprocessor) and kernel definition are now nicely separated"
X Link 2024-05-24T17:46Z [----] followers, [---] engagements
"If you are at #ICML2024 last chance today to play with MLX on the M2 Ultra or iPad at the Apple booth If you are a fan already come tell us the cool things you built"
X Link 2024-07-24T08:10Z [----] followers, [----] engagements
"Love this I think the barrier to entry for playing with LLMs doesnt get any lower Made a minimal but fast example of text generation with Llama [---] in MLX. Minimal: [--] file [---] lines of simple code [--] dependencies Fast: 100+ toks/sec with 4-bit 8B on M2 Ultra Code: https://t.co/aCfcEG7geA https://t.co/mw9PhLilBn Made a minimal but fast example of text generation with Llama [---] in MLX. Minimal: [--] file [---] lines of simple code [--] dependencies Fast: 100+ toks/sec with 4-bit 8B on M2 Ultra Code: https://t.co/aCfcEG7geA https://t.co/mw9PhLilBn"
X Link 2024-08-09T18:43Z [----] followers, [----] engagements
"It is quite easy to underestimate how hard that is to do in any other framework Lazy evaluation in MLX is a very cool feature but takes some getting used to. Here's one example where it is useful: for low RAM devices lazy loading/computation makes it almost trivial to stream a model from disk without ever materializing the full thing. https://t.co/nodT6gLYza Lazy evaluation in MLX is a very cool feature but takes some getting used to. Here's one example where it is useful: for low RAM devices lazy loading/computation makes it almost trivial to stream a model from disk without ever"
X Link 2024-09-05T21:35Z [----] followers, [----] engagements
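A sketch of the streaming pattern those lazy semantics enable: build the forward pass symbolically and evaluate layer by layer, so only one layer's weights and activations are materialized at a time. Treating `mx.load` as lazy for the weight file is an assumption of this sketch, and the key names are made up:

```python
# Sketch of the streaming pattern: evaluate the network layer by layer so
# only one layer's weights need to be resident at a time. Treating mx.load
# as lazy for the weight file is an assumption; the key names are made up.
import mlx.core as mx

weights = mx.load("model.safetensors")   # arrays not materialized yet

def forward(x, n_layers):
    for i in range(n_layers):
        w = weights[f"layers.{i}.weight"]  # hypothetical key layout
        x = mx.maximum(x @ w, 0.0)         # toy layer: matmul + ReLU
        mx.eval(x)                         # materialize just this step
    return x
```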
"@priontific This is possible in all frameworks. The only thing that changes is simplicity and efficiency. Eg I am pretty sure that in PyTorch one would have to manually unload layers mid computation. Simply put with MLX the efficiency is great and I dont think it gets any simpler"
X Link 2024-09-06T07:35Z [----] followers, [--] engagements
"It is rare to see such a thorough and complete analysis nowadays. [--] awesome pages of appendix Any comparison/ablation you didn't know you wanted is probably there already. It was a privilege to see it being developed by Matteo David and Pierre. Stop discarding your old gradients Introducing AdEMAMix a novel (first-order) optimizer capable of outperforming Adam. Lets have a thread on momentum and the surprising relevance of very old gradients. A joint work with @GrangierDavid and @PierreAblin #ml #optimization 1/๐งต https://t.co/MbGVcSIPdg Stop discarding your old gradients Introducing"
X Link 2024-09-06T22:11Z [----] followers, [----] engagements
"This is very cool It has also been a few times I have wanted to shoutout at . @zcbenz has done an awesome job with the MLX bindings to Node.js. They are quite enjoyable to read with the proper amount of JS specific add-ons or quirks. https://github.com/frost-beta/node-mlx I wrote a CLI tool for semantic images search using Node.js and MLX. No third party API is used everything runs locally. Index is built of image embeddings with CLIP model and searching is just computing cosine similarities. https://t.co/QNuwkKX1x6 https://t.co/YNNgmetG9k https://github.com/frost-beta/node-mlx I wrote a CLI"
X Link 2024-09-16T16:47Z [----] followers, [---] engagements
"An interesting take-away from this work is that sequence-level routing works in real world tasks like question answering by routing based on a short prefix. This enables practically communication-free training of a traditional mixture of experts model and cheap inference. Dont have fast interconnect No problem No need for nodes to talk We introduce SMALLTALK LM an innovative method for training a mixture of language models (almost) asynchronously. SMALLTALK LM achieves better perplexity and better accuracy on a majority of downstream tasks https://t.co/ZWn0eQxzEe Dont have fast interconnect"
X Link 2024-10-11T06:29Z [----] followers, [----] engagements
"Fantastic work by @NasFilippova before even starting her PhD ๐"
X Link 2024-10-11T06:29Z [----] followers, [---] engagements
"@MePavelKral @awnihannun Its currently around 50GB but there are many things left to do to get this down to less than [--]. More PRs coming โบ"
X Link 2024-10-11T19:00Z [----] followers, [--] engagements
"@priontific @MePavelKral @awnihannun Hmm try to not generate progress images it saves some memory (it shouldn't but it does will look into it). For 512x512 batch size [--] I am getting 49GB so it should be doable without swap. Agreed though QLoRA (and/or checkpointing) is gonna make a huge difference"
X Link 2024-10-12T02:25Z [----] followers, [---] engagements
"Now merged with a small README to help people get started. Enjoy Clear your weekend for some FLUX LoRA fine-tuning in MLX thanks to @angeloskath https://t.co/PeM8vnHoZt Clear your weekend for some FLUX LoRA fine-tuning in MLX thanks to @angeloskath https://t.co/PeM8vnHoZt"
X Link 2024-10-12T04:25Z [----] followers, [----] engagements
"@awnihannun @priontific It is also the schnell model which is significantly harder to fine tune"
X Link 2024-10-12T19:29Z [----] followers, [---] engagements
"In the new MLX release we break the 110tps barrier out of the box for Mistral 7B Here is the speed (before and after) of a bunch of LMs writing a story about Einstein on an M2 Ultra"
X Link 2024-10-18T20:01Z [----] followers, [----] engagements
"@ruairiSpain This is all coming from the fused attention kernel which we just added. It may even have some space for further optimization in the next few days"
X Link 2024-10-18T20:56Z [----] followers, [---] engagements
"It has been some time since we added distributed support to MLX. Even via 10Gbps Ethernet distributed LoRA scales perfectly. I can't wait for people to train large adapters via thunderbolt or small ones over Wi-Fi. There is so much compute lying around (nodes are M2 Ultras)"
X Link 2024-10-31T21:15Z [----] followers, [----] engagements
"@Prince_Canuma There is but they need a bit of a refresh. I will see if it makes sense to make a small launch helper script that will prevent a bunch of "foot-guns" related to launching MPI jobs. https://ml-explore.github.io/mlx/build/html/usage/distributed.html https://ml-explore.github.io/mlx/build/html/usage/distributed.html"
X Link 2024-10-31T22:57Z [----] followers, [---] engagements
"@Prince_Canuma Here is the PR It's been there months ๐
we 'll merge it shortly. TL;DR: The mlx_lm.lora supports distributed finetuning. All you have to do is launch it with mpirun . https://github.com/ml-explore/mlx-examples/pull/821 https://github.com/ml-explore/mlx-examples/pull/821"
X Link 2024-10-31T23:21Z [----] followers, [---] engagements
"@Prince_Canuma Indeed. It is also the case for flux finetuning btw"
X Link 2024-10-31T23:30Z [----] followers, [--] engagements
"@Prince_Canuma Each machine has the full model ie this is simple data parallel training. Given that the Macs have comparatively very high memory capacity per FLOP ratio I think it is also the best strategy. Combined with grad accumulation and checkpointing you can fine-tune almost anything"
X Link 2024-10-31T23:38Z [----] followers, [---] engagements
"@Prince_Canuma Yeah of course you can implement it. But fyi pipeline training is significantly more complicated than pipeline inference (and harder to get linear scaling as well)"
X Link 2024-11-01T00:32Z [----] followers, [--] engagements
"@unixpickle Indeed but isnt this like 90% of fine-tuning runs Ymmv but for a 14B even 200M parameters LoRA scales very reasonably via Ethernet. Something like 5x speed up for [--] nodes"
X Link 2024-11-01T04:48Z [----] followers, [--] engagements
"@ivanfioravanti Thats exciting ๐. Ethernet thunderbolt or WiFi"
X Link 2024-11-03T19:51Z [----] followers, [--] engagements
"@ivanfioravanti Looking at that you should probably also increase the batch size. It will help amortize the communication cost. Here each machine deals with [--] sequences I would do [--] per machine meaning [--] total"
X Link 2024-11-03T21:23Z [----] followers, [--] engagements
"At these speeds who needs RAG am I right In the latest MLX generating with long prompts is much faster with KV quantization. Thanks to @alex_barron1. 4-bit Llama 8B with a [-----] token prompt + 8-bit KV cache generates at [--] toks/sec on an M2 Ultra: https://t.co/TlBZTOhpfD In the latest MLX generating with long prompts is much faster with KV quantization. Thanks to @alex_barron1. 4-bit Llama 8B with a [-----] token prompt + 8-bit KV cache generates at [--] toks/sec on an M2 Ultra: https://t.co/TlBZTOhpfD"
X Link 2024-11-06T22:05Z [----] followers, [---] engagements
"@unixpickle Our convs (especially the backward) do need some love. We are working on it and they will be faster soon ๐"
X Link 2024-11-22T00:11Z [----] followers, [---] engagements
"@caviterginsoy @ivanfioravanti @alex_barron1 @DiganiJagrit Neglected a bit for sure but what about ๐ https://github.com/ml-explore/mlx-data https://github.com/ml-explore/mlx-data"
X Link 2024-11-22T21:44Z [----] followers, [--] engagements
"@ivanfioravanti Hm feel free to drop an issue if you think that there is something to be fixed"
X Link 2025-04-04T15:57Z [----] followers, [---] engagements
"@DiganiJagrit has made awesome building blocks to write matmul kernels in MLX. It would be so much more work to write this otherwise and it would also be slower. Latest mlx + mlx-lm have much faster prompt processing speeds for MoEs. Thanks to some magic from @angeloskath pip install -U mlx-lm Deep Seek v3 up to 2x faster Mixtral up to 3.5x faster Llama [--] up to 2x faster Latest mlx + mlx-lm have much faster prompt processing speeds for MoEs. Thanks to some magic from @angeloskath pip install -U mlx-lm Deep Seek v3 up to 2x faster Mixtral up to 3.5x faster Llama [--] up to 2x faster"
X Link 2025-04-18T21:18Z [----] followers, [----] engagements
"Kudos to David Koski From discussion to merged PR in [--] days. You can't beat open source We heard developer feedback on Monday that the MLX Swift LLM API is hard to get started with. So we went to work. We made a new and improved streamlined API. Now [--] lines toload an LLM or VLM and start a chat session in a Swift project. Happy last day of #WWDC25 https://t.co/NA00RJqjVR We heard developer feedback on Monday that the MLX Swift LLM API is hard to get started with. So we went to work. We made a new and improved streamlined API. Now [--] lines toload an LLM or VLM and start a chat session in a"
X Link 2025-06-13T18:39Z [----] followers, [----] engagements
"@awnihannun added batched generation to MLX-LM [--] months ago. Everybody since has been asking for batching in the MLX-LM server. Well enjoy the first version in the latest MLX-LM release. The following video is serving [--] consecutive requests for Qwen3 30B on an M2 Ultra"
X Link 2025-12-03T23:42Z [----] followers, 14.2K engagements
"@RickRossTN @awnihannun Nice It depends on the model as well. MoEs will hit more experts when batched so they will scale a little worse than dense models"
X Link 2025-12-09T23:33Z [----] followers, [--] engagements
"@ivanfioravanti @awnihannun ๐ Let us know how it goes Deepseek V3.2 won't work with chat cause the chat template is missing. I changed it in the docs sorry for the large download"
X Link 2025-12-17T19:22Z [----] followers, [---] engagements
"Low latency communication is crucial for tensor parallel inference which is now available on the latest mlx-lm (not on pypi yet). In the following video Devstral is generating a quicksort in C++ 1.7x faster on [--] M3 Ultras (right) vs on [--] (left). The latest MLX is out And ithas a new distributed back-end (JACCL) that uses RDMA over TB5 for super low-latency communication across multiple Macs. Thanks to @angeloskath https://t.co/254dMxND9W The latest MLX is out And ithas a new distributed back-end (JACCL) that uses RDMA over TB5 for super low-latency communication across multiple Macs. Thanks"
X Link 2025-12-18T19:40Z [----] followers, [----] engagements
"Latest MLX release on macOS [----] has an updated JACCL with significantly increased distributed bandwidth. (Merged PR This results in great prompt processing scaling and reduced time to first token 3.3x speedup for both MoEs and dense models (4 nodes). https://github.com/ml-explore/mlx/pull/3094 https://github.com/ml-explore/mlx/pull/3094"
X Link 2026-02-07T02:55Z [----] followers, 18.7K engagements
"Moving the latest LLM from one Mac to the other can now be done much faster using mlx distributed. SSD write speed is now the bottleneck at around [---] GB/s. Here broadcasting M2.5 8bit in 45s to [--] other Macs. Needs macOS [----] and mlx-lm main not yet on PyPi"
X Link 2026-02-14T02:44Z [----] followers, [----] engagements
"I am really excited about our latest work A simple efficient framework to experiment with modern neural networks even on your laptop [--] lines to write a transformer LM ๐ฅณ Just in time for the holidays we are releasing some new software today from Apple machine learning research. MLX is an efficient machine learning framework specifically designed for Apple silicon (i.e. your laptop) Code: https://t.co/Kbis7IrP80 Docs: https://t.co/CUQb80HGut Just in time for the holidays we are releasing some new software today from Apple machine learning research. MLX is an efficient machine learning"
X Link 2023-12-06T07:43Z [----] followers, 35.2K engagements
"Just in time for the holidays we are releasing some new software today from Apple machine learning research. MLX is an efficient machine learning framework specifically designed for Apple silicon (i.e. your laptop) Code: Docs: https://ml-explore.github.io/mlx/build/html/index.html https://github.com/ml-explore/mlx https://ml-explore.github.io/mlx/build/html/index.html https://github.com/ml-explore/mlx"
X Link 2023-12-05T23:45Z 37.8K followers, 901.3K engagements
"We implemented quantization from scratch in a week. I think that is one of the biggest strengths of MLX. Easy to use but also easy to extend and customize. We cant wait to see what people will implement in a month Big update to MLX but especially ๐ฅ N-bit quantization and quantized matmul kernels Thanks to the wizardry of @angeloskath pip install -U mlx https://t.co/YcUhRv31TP Big update to MLX but especially ๐ฅ N-bit quantization and quantized matmul kernels Thanks to the wizardry of @angeloskath pip install -U mlx https://t.co/YcUhRv31TP"
X Link 2023-12-22T05:55Z [----] followers, 27.1K engagements
"Big update to MLX but especially ๐ฅ N-bit quantization and quantized matmul kernels Thanks to the wizardry of @angeloskath pip install -U mlx"
X Link 2023-12-22T02:54Z 37.8K followers, 121.6K engagements
"What I find even cooler than training on an iPhone is that it is done with just [--] lines of code that are super readable and very familiar to anyone that writes training loops in python. Let's go MLX Swift ๐๐๐ Using MLX Swift to train LeNet on MNIST. Takes less than a minute on my iPhone [--]. Example here: https://t.co/lQs6mECoIK @ylecun long-live MNIST https://t.co/T5eSdcDBM3 Using MLX Swift to train LeNet on MNIST. Takes less than a minute on my iPhone [--]. Example here: https://t.co/lQs6mECoIK @ylecun long-live MNIST https://t.co/T5eSdcDBM3"
X Link 2024-03-05T04:51Z [----] followers, 19.3K engagements
"Using MLX Swift to train LeNet on MNIST. Takes less than a minute on my iPhone [--]. Example here: @ylecun long-live MNIST https://github.com/ml-explore/mlx-swift-examples https://github.com/ml-explore/mlx-swift-examples"
X Link 2024-03-05T04:36Z 37.8K followers, 93.8K engagements
"I have to say it because @awnihannun is quick to give credit to others but doesnt take much for himself. This performance improvement largely comes from his relentless hunting down of every kind of overhead in MLX the past weeks. Kudos MLX [----] [----] faster generation across model sizes and machines. tokens-per-second for 4-bit models: https://t.co/o5f3WmVzGZ MLX [----] [----] faster generation across model sizes and machines. tokens-per-second for 4-bit models: https://t.co/o5f3WmVzGZ"
X Link 2024-04-20T19:51Z [----] followers, 17.8K engagements
"MLX [----] [----] faster generation across model sizes and machines. tokens-per-second for 4-bit models:"
X Link 2024-04-20T18:59Z 37.8K followers, 29.2K engagements
"It has been some time since we added distributed support to MLX. Even via 10Gbps Ethernet distributed LoRA scales perfectly. I can't wait for people to train large adapters via thunderbolt or small ones over Wi-Fi. There is so much compute lying around (nodes are M2 Ultras)"
X Link 2024-10-31T21:15Z [----] followers, [----] engagements
"If you are at #ICML2024 last chance today to play with MLX on the M2 Ultra or iPad at the Apple booth If you are a fan already come tell us the cool things you built"
X Link 2024-07-24T08:10Z [----] followers, [----] engagements
"Looking back at all the amazing things people built with MLX in a couple of months I am incredibly excited to see the things that will be built now in a familiar dev environment in Swift Just [--] lines of code to write a general multi-head attention in MLX Swift ๐๐๐ As part of our goal to make MLX a great research tool we're expanding support to new languages like Swift and C making experimentation on Apple silicon easier for ML researchers. Video generating text with Mistral 7B and MLX Swift ๐ MLX is an array framework for machine https://t.co/nKJcgqePyr As part of our goal to make MLX a"
X Link 2024-02-20T22:20Z [----] followers, 11.7K engagements
"As part of our goal to make MLX a great research tool we're expanding support to new languages like Swift and C making experimentation on Apple silicon easier for ML researchers. Video generating text with Mistral 7B and MLX Swift ๐ MLX is an array framework for machine learning research on Apple silicon. MLX is intended for research and not for production deployment of models in apps"
X Link 2024-02-20T20:02Z 37.8K followers, 89K engagements
"In the new MLX release we break the 110tps barrier out of the box for Mistral 7B Here is the speed (before and after) of a bunch of LMs writing a story about Einstein on an M2 Ultra"
X Link 2024-10-18T20:01Z [----] followers, [----] engagements
"Code is also available If you want to experiment with clustered attention all you need to do is pip install pytorch-fast-transformers and then use attention_type="improved-clustered". Enjoy One paper accepted at @NeurIPSConf with @apoorv2904 and @angeloskath on speeding up attention by clustering the queries. The nice thing is that this can be used for inference with standard pre-trained models. https://t.co/lSK2DgGiQC @Idiap_ch @unige_en @EPFL_en @snsf_ch One paper accepted at @NeurIPSConf with @apoorv2904 and @angeloskath on speeding up attention by clustering the queries. The nice thing is"
X Link 2020-09-26T11:15Z [----] followers, [--] engagements
"One paper accepted at @NeurIPSConf with @apoorv2904 and @angeloskath on speeding up attention by clustering the queries. The nice thing is that this can be used for inference with standard pre-trained models. @Idiap_ch @unige_en @EPFL_en @snsf_ch https://arxiv.org/abs/2007.04825 https://arxiv.org/abs/2007.04825"
X Link 2020-09-26T08:45Z 48.7K followers, [---] engagements
"How about your personal chat GPT on your M2 Ultra Amazing model by Mistral AI and [--] day to implement it in MLX. Mixtral 8x7B in MLX https://t.co/wh4PmlQltK Runs on an M2 Ultra ๐ข๐ข https://t.co/AFeEKyvmUu Mixtral 8x7B in MLX https://t.co/wh4PmlQltK Runs on an M2 Ultra ๐ข๐ข https://t.co/AFeEKyvmUu"
X Link 2023-12-12T19:16Z [----] followers, [----] engagements
"Mixtral 8x7B in MLX Runs on an M2 Ultra ๐ข๐ข https://github.com/ml-explore/mlx-examples/tree/main/mixtral https://github.com/ml-explore/mlx-examples/tree/main/mixtral"
X Link 2023-12-12T18:49Z 37.8K followers, 240.4K engagements
"I assembled the @NeurIPSConf [----] accepted papers in a list that is easy to filter by author name affiliation and paper title. Which company do you think has the most first author papers https://angeloskath.github.io/neurips-2020-accepted-papers.html https://angeloskath.github.io/neurips-2020-accepted-papers.html"
X Link 2020-10-06T23:27Z [----] followers, [--] engagements
"@GoogleAI @Pablogomez3 For the "few" of us that don't use JAX yet you can now experiment with FAVOR+ (and other Fourier features) in @PyTorch using our fast-transformers library with just [--] lines of code. Code: Docs: http://fast-transformers.github.io/feature_maps/ http://github.com/idiap/fast-transformers http://fast-transformers.github.io/feature_maps/ http://github.com/idiap/fast-transformers"
X Link 2020-10-23T21:47Z [----] followers, [--] engagements
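A usage sketch of the builder API the linked docs describe, written from memory; the exact names (`Favor`, `n_dims`, the `attention_type` strings) should be checked against those docs before use:

```python
# Usage sketch, from memory, of the fast-transformers builder with a FAVOR+
# feature map; check names against the linked docs before relying on them.
from fast_transformers.builders import TransformerEncoderBuilder
from fast_transformers.feature_maps import Favor

model = TransformerEncoderBuilder.from_kwargs(
    n_layers=4,
    n_heads=8,
    query_dimensions=64,
    value_dimensions=64,
    attention_type="linear",                # "improved-clustered" also exists
    feature_map=Favor.factory(n_dims=256),  # FAVOR+ random features
).get()
```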
"For the native Greek speakers you can already interact with Meltemi on your laptop directly from HF using MLX. I also uploaded a quantized 4-bit version on mlx-community for faster inference. Almost [--] tokens per second on a MacBook Air and [--] on an M2 Ultra https://t.co/5QagPC5fjx https://t.co/5QagPC5fjx"
X Link 2024-03-26T22:38Z [----] followers, [----] engagements
"https://medium.com/institute-for-language-and-speech-processing/meltemi-a-large-language-model-for-greek-9f5ef1d4a10f https://medium.com/institute-for-language-and-speech-processing/meltemi-a-large-language-model-for-greek-9f5ef1d4a10f"
X Link 2024-03-26T16:23Z [---] followers, 10.9K engagements
"Kudos to David Koski From discussion to merged PR in [--] days. You can't beat open source We heard developer feedback on Monday that the MLX Swift LLM API is hard to get started with. So we went to work. We made a new and improved streamlined API. Now [--] lines toload an LLM or VLM and start a chat session in a Swift project. Happy last day of #WWDC25 https://t.co/NA00RJqjVR We heard developer feedback on Monday that the MLX Swift LLM API is hard to get started with. So we went to work. We made a new and improved streamlined API. Now [--] lines toload an LLM or VLM and start a chat session in a"
X Link 2025-06-13T18:39Z [----] followers, [----] engagements
"We heard developer feedback on Monday that the MLX Swift LLM API is hard to get started with. So we went to work. We made a new and improved streamlined API. Now [--] lines toload an LLM or VLM and start a chat session in a Swift project. Happy last day of #WWDC25"
X Link 2025-06-13T17:01Z 37.8K followers, 49.9K engagements
"My favorite new addition to MLX in v0.14 is the option to just-in-time compile the kernels in order to create a small binary to ease deployment More than 10x reduction to the Metal library size. https://ml-explore.github.io/mlx/build/html/install.html#binary-size-minimization pip install -U mlx https://t.co/CJYcIRJr9J https://ml-explore.github.io/mlx/build/html/install.html#binary-size-minimization pip install -U mlx https://t.co/CJYcIRJr9J"
X Link 2024-05-24T17:46Z [----] followers, [----] engagements
"pip install -U mlx"
X Link 2024-05-24T02:45Z 37.8K followers, 17.9K engagements
"Now merged with a small README to help people get started. Enjoy Clear your weekend for some FLUX LoRA fine-tuning in MLX thanks to @angeloskath https://t.co/PeM8vnHoZt Clear your weekend for some FLUX LoRA fine-tuning in MLX thanks to @angeloskath https://t.co/PeM8vnHoZt"
X Link 2024-10-12T04:25Z [----] followers, [----] engagements
"Clear your weekend for some FLUX LoRA fine-tuning in MLX thanks to @angeloskath https://github.com/ml-explore/mlx-examples/pull/1028 https://github.com/ml-explore/mlx-examples/pull/1028"
X Link 2024-10-11T02:55Z 37.8K followers, 19.9K engagements
"I feel very lucky to have been at Idiap it is a great place to pursue a PhD. I would also like to thank @francoisfleuret. I couldn't have asked for a better PhD advisor Idiaper wins @EPFL's EEDE Thesis Award ๐ Former #PhD from our institute @angeloskath has received EPFL's Electrical Engineering Doctoral program (#EEDE) Thesis Award for his outstanding research on the efficiency of #DeepLearning models. https://t.co/LO8wgKtc4U Idiaper wins @EPFL's EEDE Thesis Award ๐ Former #PhD from our institute @angeloskath has received EPFL's Electrical Engineering Doctoral program (#EEDE) Thesis Award"
X Link 2022-11-24T18:48Z [----] followers, [--] engagements
"Idiaper wins @EPFL's EEDE Thesis Award ๐ Former #PhD from our institute @angeloskath has received EPFL's Electrical Engineering Doctoral program (#EEDE) Thesis Award for his outstanding research on the efficiency of #DeepLearning models. https://www.idiap.ch/en/allnews/idiaper-wins-epfls-eede-thesis-award https://www.idiap.ch/en/allnews/idiaper-wins-epfls-eede-thesis-award"
X Link 2022-11-24T14:32Z [----] followers, [--] engagements
"It is quite easy to underestimate how hard that is to do in any other framework Lazy evaluation in MLX is a very cool feature but takes some getting used to. Here's one example where it is useful: for low RAM devices lazy loading/computation makes it almost trivial to stream a model from disk without ever materializing the full thing. https://t.co/nodT6gLYza Lazy evaluation in MLX is a very cool feature but takes some getting used to. Here's one example where it is useful: for low RAM devices lazy loading/computation makes it almost trivial to stream a model from disk without ever"
X Link 2024-09-05T21:35Z [----] followers, [----] engagements
"Lazy evaluation in MLX is a very cool feature but takes some getting used to. Here's one example where it is useful: for low RAM devices lazy loading/computation makes it almost trivial to stream a model from disk without ever materializing the full thing"
X Link 2024-09-05T21:27Z 37.8K followers, [----] engagements
"Thank you Yannic for the amazing video. The topic modeling intuition is a very interesting way to think about it and I hadn't thought of the kernels this way. Anybody that doesn't follow Yannic is seriously missing out Check out his channel https://www.youtube.com/c/YannicKilcher New Video ๐ฅ No more O(N2) complexity in Transformers: Kernels to the rescue ๐ฅณ This paper makes Attention linear AND shows an intriguing connection between Transformers and RNNs ๐ช https://t.co/jtFECITlpD @angeloskath @apoorv2904 @nik0spapp @francoisfleuret @EPFL_en @Idiap_ch https://t.co/QDxwkNM3jU"
X Link 2020-07-04T13:36Z [----] followers, [--] engagements
"New Video ๐ฅ No more O(N2) complexity in Transformers: Kernels to the rescue ๐ฅณ This paper makes Attention linear AND shows an intriguing connection between Transformers and RNNs ๐ช @angeloskath @apoorv2904 @nik0spapp @francoisfleuret @EPFL_en @Idiap_ch https://youtu.be/hAooAOFRsYc https://youtu.be/hAooAOFRsYc"
X Link 2020-07-04T12:46Z 84.2K followers, [---] engagements
"This is too cool. Now let's combine it with a TTS model and have it tell us nice stories while looking at the beautiful lake. Apple MLX on Vision Pro YES YOU CAN BOOM Here the raw video of MLX Swift LLMEval example running natively on the device Thanks @awnihannun ๐ ๐ฅ๐ฅ๐ฅ #VisionPro #LLM #Apple https://t.co/35T960KopQ Apple MLX on Vision Pro YES YOU CAN BOOM Here the raw video of MLX Swift LLMEval example running natively on the device Thanks @awnihannun ๐ ๐ฅ๐ฅ๐ฅ #VisionPro #LLM #Apple https://t.co/35T960KopQ"
X Link 2024-03-20T18:21Z [----] followers, [----] engagements
"Apple MLX on Vision Pro YES YOU CAN BOOM Here the raw video of MLX Swift LLMEval example running natively on the device Thanks @awnihannun ๐ ๐ฅ๐ฅ๐ฅ #VisionPro #LLM #Apple"
X Link 2024-03-20T17:54Z 20.4K followers, 40.2K engagements
"What a game #PameStefane #RolandGarros #Tsitsipas"
X Link 2021-06-13T15:12Z [----] followers, [--] engagements
"Because you haven't really released code until you release the documentation. I just finished the first version of docs for our ICML2019 paper You can find it at Oh also you can just pip install attention-sampling . https://x.com/francoisfleuret/status/1126813878812323841 http://attention-sampling.com/ And here it is on @arxiv https://t.co/YK9y6YxWBT TL;DR: A network computes an attention map on a downscaled image and another processes locations sampled according to that map. The pair can be trained end-to-end. https://x.com/francoisfleuret/status/1126813878812323841"
X Link 2019-07-22T20:31Z [----] followers, [--] engagements
"And here it is on @arxiv TL;DR: A network computes an attention map on a downscaled image and another processes locations sampled according to that map. The pair can be trained end-to-end. https://arxiv.org/abs/1905.03711 https://arxiv.org/abs/1905.03711"
X Link 2019-05-10T11:38Z 48.7K followers, [---] engagements
"@DiganiJagrit has made awesome building blocks to write matmul kernels in MLX. It would be so much more work to write this otherwise and it would also be slower. Latest mlx + mlx-lm have much faster prompt processing speeds for MoEs. Thanks to some magic from @angeloskath pip install -U mlx-lm Deep Seek v3 up to 2x faster Mixtral up to 3.5x faster Llama [--] up to 2x faster Latest mlx + mlx-lm have much faster prompt processing speeds for MoEs. Thanks to some magic from @angeloskath pip install -U mlx-lm Deep Seek v3 up to 2x faster Mixtral up to 3.5x faster Llama [--] up to 2x faster"
X Link 2025-04-18T21:18Z [----] followers, [----] engagements
"Latest mlx + mlx-lm have much faster prompt processing speeds for MoEs. Thanks to some magic from @angeloskath pip install -U mlx-lm Deep Seek v3 up to 2x faster Mixtral up to 3.5x faster Llama [--] up to 2x faster"
X Link 2025-04-18T20:09Z 37.8K followers, 17.3K engagements
"Love this I think the barrier to entry for playing with LLMs doesnt get any lower Made a minimal but fast example of text generation with Llama [---] in MLX. Minimal: [--] file [---] lines of simple code [--] dependencies Fast: 100+ toks/sec with 4-bit 8B on M2 Ultra Code: https://t.co/aCfcEG7geA https://t.co/mw9PhLilBn Made a minimal but fast example of text generation with Llama [---] in MLX. Minimal: [--] file [---] lines of simple code [--] dependencies Fast: 100+ toks/sec with 4-bit 8B on M2 Ultra Code: https://t.co/aCfcEG7geA https://t.co/mw9PhLilBn"
X Link 2024-08-09T18:43Z [----] followers, [----] engagements
"Made a minimal but fast example of text generation with Llama [---] in MLX. Minimal: [--] file [---] lines of simple code [--] dependencies Fast: 100+ toks/sec with 4-bit 8B on M2 Ultra Code: https://gist.github.com/awni/cf42588b8c084c3d93d7373b604c7f9c https://gist.github.com/awni/cf42588b8c084c3d93d7373b604c7f9c"
X Link 2024-08-09T14:50Z 37.8K followers, 13.1K engagements
"I assembled the @icmlconf [----] accepted papers in a list that is easy to filter based for instance on affiliations or title. First authors from: Google [--] Microsoft [--] Facebook [--] Amazon [--] Apple [--] @EPFL_en [--] @ETH [--] #ICML2019 http://idiap.ch/katharas/pages/accepted-papers-at-icml-2019.html http://idiap.ch/katharas/pages/accepted-papers-at-icml-2019.html"
X Link 2019-05-11T13:38Z [----] followers, [--] engagements
"@unixpickle @awnihannun Unified memory is the big one. The fast Metal kernels and linking to accelerate or Apple specific SIMD instructions would be another one. We are very excited to explore what new architecture the above will enable or the impact to the existing ones"
X Link 2023-12-06T02:54Z [----] followers, [----] engagements
"ICCV reviewer invitation expires 2/1/2021 . now does that mean I missed it or that when addressing an international crowd the US date notation is very confusing"
X Link 2021-01-11T22:22Z [----] followers, [--] engagements
"I know which model I am uploading to MLX community today ๐ https://t.co/5QagPC5fjx https://t.co/5QagPC5fjx"
X Link 2024-03-26T17:36Z [----] followers, [----] engagements
"Did you know that clustered attention approximates a pretrained wav2vec on librispeech two times better than Performer's FAVOR Come talk to us at our #NeurIPS2020 poster in [--] hours to find out more With @angeloskath and @francoisfleuret we will present our work on fast transformers with clustering at #NeurIPS2020 on Thu @ 18:00 CET. Please visit our poster to know more. We will also answer questions on chat. Poster: https://t.co/o4eaP8n69P Project: https://t.co/QTekQFBq0k With @angeloskath and @francoisfleuret we will present our work on fast transformers with clustering at #NeurIPS2020 on"
X Link 2020-12-10T14:58Z [----] followers, [--] engagements
"With @angeloskath and @francoisfleuret we will present our work on fast transformers with clustering at #NeurIPS2020 on Thu @ 18:00 CET. Please visit our poster to know more. We will also answer questions on chat. Poster: Project: http://clustered-transformers.github.io/ http://neurips.cc/virtual/2020/protected/poster_f6a8dd1c954c8506aadc764cc32b895e.html http://clustered-transformers.github.io/ http://neurips.cc/virtual/2020/protected/poster_f6a8dd1c954c8506aadc764cc32b895e.html"
X Link 2020-12-08T22:01Z [---] followers, [--] engagements
"An interesting take-away from this work is that sequence-level routing works in real world tasks like question answering by routing based on a short prefix. This enables practically communication-free training of a traditional mixture of experts model and cheap inference. Dont have fast interconnect No problem No need for nodes to talk We introduce SMALLTALK LM an innovative method for training a mixture of language models (almost) asynchronously. SMALLTALK LM achieves better perplexity and better accuracy on a majority of downstream tasks https://t.co/ZWn0eQxzEe Dont have fast interconnect"
X Link 2024-10-11T06:29Z [----] followers, [----] engagements
"Dont have fast interconnect No problem No need for nodes to talk We introduce SMALLTALK LM an innovative method for training a mixture of language models (almost) asynchronously. SMALLTALK LM achieves better perplexity and better accuracy on a majority of downstream tasks compared to a regular dense language model trained with the same amount of FLOPs. Paper: A joint work with amazing @angeloskath @GrangierDavid and Ronan Collobert. Let me explain how we did it -๐งต: 1/ https://arxiv.org/pdf/2410.03529 https://arxiv.org/pdf/2410.03529"
X Link 2024-10-11T02:35Z [---] followers, [----] engagements
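A hedged sketch of the sequence-level routing described above, not the paper's code: score a short prefix under each expert LM and train or infer the whole sequence on the expert that models the prefix best. The `experts` here are hypothetical per-token log-prob callables.

```python
def avg_nll(expert_logprob, prefix):
    # average negative log-likelihood of the prefix under one expert
    return -sum(expert_logprob(tok) for tok in prefix) / max(len(prefix), 1)

def route(sequence, experts, prefix_len=32):
    prefix = sequence[:prefix_len]
    scores = [avg_nll(e, prefix) for e in experts]
    # no gradients or activations cross nodes: routing needs only the prefix
    return min(range(len(experts)), key=scores.__getitem__)
```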
"I wish CMT had a negative tweet limit. Basically if your review fits in a tweet you shouldn't be able to submit it. #cvpr2020 #cvpr"
X Link 2020-02-01T19:20Z [----] followers, [--] engagements
"To reproduce the video above first pip install -U mlx_lm and then python -m mlx_lm.generate --model mlx-community/ilsp-Meltemi-7B-Instruct-v1-4bit --prompt " ." --temp [---] --max-tokens [----] on any M-series Mac"
X Link 2024-03-26T22:38Z [----] followers, [---] engagements
"Awesome work by a friend in @Oxford_VGG Watch people fighting on TV (we all like that right) without missing a single thing anybody says. Related publications: http://www.robots.ox.ac.uk/vgg/publications/2018/Afouras18b/afouras18b.pdf http://www.robots.ox.ac.uk/vgg/publications/2018/Afouras18/afouras18.pdf Can #AI modelling help people with hearing difficulties Discover how #OxfordAI could assist those with hearing difficulties by isolating voices in noisy environments: https://t.co/AmrE7QIoqw https://t.co/nzJAEuzKf9 http://www.robots.ox.ac.uk/vgg/publications/2018/Afouras18b/afouras18b.pdf"
X Link 2018-11-04T17:17Z [----] followers, [--] engagements
"Can #AI modelling help people with hearing difficulties Discover how #OxfordAI could assist those with hearing difficulties by isolating voices in noisy environments: http://po.st/OxfordAI http://po.st/OxfordAI"
X Link 2018-11-03T08:11Z 1M followers, [---] engagements
"What started in May is finalized in Greece's national elections yesterday. The far-right neo-fascist party did not make it in the greek parliament Hopefully the rest of Europe will follow. #ekloges19 #greekelections2019 #Europe https://x.com/angeloskath/status/1133354726530146306s=19 The definition of mixed feelings: When the far-right party of your country loses half their votes in [--] years and at the same time they will have [--] representatives in the european parliament because 4.9% is still too much. #EuropeanElectionResults #EUelections2019"
X Link 2019-07-08T11:10Z [----] followers, [--] engagements
"The definition of mixed feelings: When the far-right party of your country loses half their votes in [--] years and at the same time they will have [--] representatives in the european parliament because 4.9% is still too much. #EuropeanElectionResults #EUelections2019"
X Link 2019-05-28T12:49Z [----] followers, [--] engagements
"@francoisfleuret But we established that it cannot do this though"
X Link 2020-03-16T08:59Z [----] followers, [--] engagements
"@Prince_Canuma There is but they need a bit of a refresh. I will see if it makes sense to make a small launch helper script that will prevent a bunch of "foot-guns" related to launching MPI jobs. https://ml-explore.github.io/mlx/build/html/usage/distributed.html https://ml-explore.github.io/mlx/build/html/usage/distributed.html"
X Link 2024-10-31T22:57Z [----] followers, [---] engagements
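A minimal sketch of an MLX distributed program, following the linked docs: `mx.distributed.init()` and `all_sum` are the documented entry points, and the script is launched through MPI (e.g. `mpirun -np 2 python this_script.py`).

```python
import mlx.core as mx

world = mx.distributed.init()       # joins the MPI group set up by mpirun
x = mx.ones((4,)) * world.rank()
total = mx.distributed.all_sum(x)   # every rank receives the sum over all ranks
mx.eval(total)
print(world.rank(), world.size(), total)
```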
"When we finished developing "Transformers are RNNs" we had planned to showcase it using music generation. We ended up not investing the necessary time but today I came across "Compound Word Transformer" and I love the generated music. Check it out https://ailabs.tw/human-interaction/compound-word-transformer-generate-pop-piano-music-of-full-song-length/ https://ailabs.tw/human-interaction/compound-word-transformer-generate-pop-piano-music-of-full-song-length/"
X Link 2021-01-12T14:08Z [----] followers, [--] engagements
"This is very cool It has also been a few times I have wanted to shoutout at . @zcbenz has done an awesome job with the MLX bindings to Node.js. They are quite enjoyable to read with the proper amount of JS specific add-ons or quirks. https://github.com/frost-beta/node-mlx I wrote a CLI tool for semantic images search using Node.js and MLX. No third party API is used everything runs locally. Index is built of image embeddings with CLIP model and searching is just computing cosine similarities. https://t.co/QNuwkKX1x6 https://t.co/YNNgmetG9k https://github.com/frost-beta/node-mlx I wrote a CLI"
X Link 2024-09-16T16:47Z [----] followers, [---] engagements
"I wrote a CLI tool for semantic images search using Node.js and MLX. No third party API is used everything runs locally. Index is built of image embeddings with CLIP model and searching is just computing cosine similarities. https://github.com/frost-beta/sisi https://github.com/frost-beta/sisi"
X Link 2024-09-16T10:45Z [----] followers, [----] engagements
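The search side of such a tool is tiny; here is a hedged NumPy sketch, separate from the linked project. `index` holds CLIP image embeddings (embedding code omitted), the query text is embedded the same way, and search is one matrix-vector product of cosine similarities.

```python
import numpy as np

def top_k(index: np.ndarray, query: np.ndarray, k: int = 5):
    index = index / np.linalg.norm(index, axis=1, keepdims=True)  # unit-normalize rows
    query = query / np.linalg.norm(query)
    sims = index @ query            # cosine similarity per indexed image
    return np.argsort(-sims)[:k]    # indices of the k best matches
```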
"@ruairiSpain This is all coming from the fused attention kernel which we just added. It may even have some space for further optimization in the next few days"
X Link 2024-10-18T20:56Z [----] followers, [---] engagements
"Switzerland is not closing schools for #COVID19 because it would endanger grandparents who would take care of the children. Greece on the other hand pays for the vacation days of one of the two parents and closes all schools for [--] days. Switzerland man-up"
X Link 2020-03-11T17:32Z [----] followers, [--] engagements
"Arxiv and code coming soon. One paper accepted at #ICML2019 with @angeloskath on attention-sampling with deep architectures to process megapixel images. One paper accepted at #ICML2019 with @angeloskath on attention-sampling with deep architectures to process megapixel images"
X Link 2019-04-22T13:56Z [----] followers, [--] engagements
"One paper accepted at #ICML2019 with @angeloskath on attention-sampling with deep architectures to process megapixel images"
X Link 2019-04-22T12:27Z 48.7K followers, [--] engagements
"@CVPRConf website is down but @paschalidoud_1 is also analog #CVPR2020"
X Link 2020-06-16T17:16Z [----] followers, [--] engagements
"It is rare to see such a thorough and complete analysis nowadays. [--] awesome pages of appendix Any comparison/ablation you didn't know you wanted is probably there already. It was a privilege to see it being developed by Matteo David and Pierre. Stop discarding your old gradients Introducing AdEMAMix a novel (first-order) optimizer capable of outperforming Adam. Lets have a thread on momentum and the surprising relevance of very old gradients. A joint work with @GrangierDavid and @PierreAblin #ml #optimization 1/๐งต https://t.co/MbGVcSIPdg Stop discarding your old gradients Introducing"
X Link 2024-09-06T22:11Z [----] followers, [----] engagements
"Stop discarding your old gradients Introducing AdEMAMix a novel (first-order) optimizer capable of outperforming Adam. Lets have a thread on momentum and the surprising relevance of very old gradients. A joint work with @GrangierDavid and @PierreAblin #ml #optimization 1/๐งต"
X Link 2024-09-06T17:25Z [---] followers, 67.1K engagements
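A hedged sketch of the AdEMAMix idea as the thread describes it: keep Adam's fast EMA of gradients plus a second, much slower EMA in which very old gradients persist, and mix both in the update. The alpha/beta3 schedules and other details from the paper are omitted; this is not the authors' code.

```python
import numpy as np

def ademamix_step(theta, g, state, lr=1e-3, b1=0.9, b2=0.999, b3=0.9999,
                  alpha=5.0, eps=1e-8):
    state["t"] += 1
    state["m1"] = b1 * state["m1"] + (1 - b1) * g   # fast EMA, as in Adam
    state["m2"] = b3 * state["m2"] + (1 - b3) * g   # slow EMA: old gradients linger
    state["v"] = b2 * state["v"] + (1 - b2) * g * g
    m1_hat = state["m1"] / (1 - b1 ** state["t"])   # Adam-style bias correction
    v_hat = state["v"] / (1 - b2 ** state["t"])
    return theta - lr * (m1_hat + alpha * state["m2"]) / (np.sqrt(v_hat) + eps)

# usage: state = {"t": 0, "m1": 0.0, "m2": 0.0, "v": 0.0}, broadcast against theta
```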
"Effort led by @awnihannun and I have to say that readability and development did not suffer at all to enable this new capability. If anything kernel instantiation (using the preprocessor) and kernel definition are now nicely separated"
X Link 2024-05-24T17:46Z [----] followers, [---] engagements
"@lucidrains @apoorv2904 @SmallerNNsPls @francoisfleuret @trees_random @icmlconf @nik0spapp @Idiap_ch @EPFL Thanks for the interest Indeed. However the main benefit of our work is the derivation of a formulation that allows to write an autoregressive transformer as an RNN; thus resulting in orders of magnitude speed up during inference. (we really need to speed up the preprint :-))"
X Link 2020-06-02T22:03Z [----] followers, [--] engagements
"Congrats to all the researchers from ILSP and Athena research center that worked on this I couldn't find twitter handles to tag people so please let me know if I should be tagging someone"
X Link 2024-03-26T22:38Z [----] followers, [---] engagements
"@caviterginsoy @ivanfioravanti @alex_barron1 @DiganiJagrit Neglected a bit for sure but what about ๐ https://github.com/ml-explore/mlx-data https://github.com/ml-explore/mlx-data"
X Link 2024-11-22T21:44Z [----] followers, [--] engagements
"@SmallerNNsPls @francoisfleuret @trees_random @icmlconf @apoorv2904 @nik0spapp @Idiap_ch @EPFL Yes they are normalized as follows (Q) (K)' V / (sum_i (Q) (K)_i). You have to assume some broadcasting semantics in the above equation due to twitter"
X Link 2020-06-01T23:03Z [----] followers, [--] engagements
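Spelled out in code, a hedged NumPy sketch of the normalized linear attention in the reply above, assuming φ is the elementwise elu(x) + 1 feature map used in "Transformers are RNNs":

```python
import numpy as np

def phi(x):
    return np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1

def linear_attention(Q, K, V):
    Qf, Kf = phi(Q), phi(K)                  # (N, D) feature-mapped queries/keys
    kv = Kf.T @ V                            # sum_i phi(k_i) v_i^T, shape (D, Dv)
    z = Kf.sum(axis=0)                       # sum_i phi(k_i), shape (D,)
    return (Qf @ kv) / (Qf @ z)[:, None]     # numerator / normalizer
```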
"@francoisfleuret Easy. Woodworker or blacksmith or both. Making tools to make tools to make tools is still one of the big joys of life"
X Link 2020-04-03T18:39Z [----] followers, [--] engagements
"So. @github you implement code search but decide to ignore . : ; / ' " = * # $ & + ( ) I am having fun searching for function definitions/implementations without being able to use "func(" or "::func""
X Link 2019-08-07T13:09Z [----] followers, [--] engagements
"@ykilcher @_florianmai @jiangelaa @zacharylipton @francoisfleuret If you are looking for an intuitive explanation regarding why these methods don't help much on hard datasets (the question raised in the video) they rely on the existence of uninformative datapoints. In Imagenet there are none for most of the training"
X Link 2019-10-07T16:22Z [----] followers, [--] engagements
"Oh and the model definition looks even more familiar"
X Link 2024-03-05T04:51Z [----] followers, [---] engagements
"@demirbasayyuce @awnihannun Well actually I dont think you need any of that due to unified memory. Quantizing the Lora example in mlx should work out of the box. Havent tried it yet but I dont see why not"
X Link 2023-12-22T01:28Z [----] followers, [---] engagements
"Usually I adore @PyTorch software engineering but going from v1.5.0 to v1.6.0 breaks at::detail::getDefaultCPUGenerator() which breaks some C++ extensions. Shouldn't that be in the release notes"
X Link 2020-08-26T17:24Z [----] followers, [--] engagements
"@unixpickle Our convs (especially the backward) do need some love. We are working on it and they will be faster soon ๐"
X Link 2024-11-22T00:11Z [----] followers, [---] engagements
"@Prince_Canuma Here is the PR It's been there months ๐
we 'll merge it shortly. TL;DR: The mlx_lm.lora supports distributed finetuning. All you have to do is launch it with mpirun . https://github.com/ml-explore/mlx-examples/pull/821 https://github.com/ml-explore/mlx-examples/pull/821"
X Link 2024-10-31T23:21Z [----] followers, [---] engagements
"@unixpickle Indeed but isnt this like 90% of fine-tuning runs Ymmv but for a 14B even 200M parameters LoRA scales very reasonably via Ethernet. Something like 5x speed up for [--] nodes"
X Link 2024-11-01T04:48Z [----] followers, [--] engagements
"@Prince_Canuma Each machine has the full model ie this is simple data parallel training. Given that the Macs have comparatively very high memory capacity per FLOP ratio I think it is also the best strategy. Combined with grad accumulation and checkpointing you can fine-tune almost anything"
X Link 2024-10-31T23:38Z [----] followers, [---] engagements
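What "simple data parallel training" means concretely, as a hedged MLX sketch: every node computes gradients on its own batch, then the gradients are averaged with one all_sum per tensor per step. `loss_fn`, `model`, `batch`, and `optimizer` are placeholders for your own training code.

```python
import mlx.core as mx
import mlx.nn as nn
from mlx.utils import tree_map

world = mx.distributed.init()  # launched via mpirun, one process per machine

def train_step(model, batch, loss_fn, optimizer):
    loss, grads = nn.value_and_grad(model, loss_fn)(model, *batch)
    # average gradients across nodes; this all_sum is the only communication
    grads = tree_map(lambda g: mx.distributed.all_sum(g) / world.size(), grads)
    optimizer.update(model, grads)
    return loss
```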
"At these speeds who needs RAG am I right In the latest MLX generating with long prompts is much faster with KV quantization. Thanks to @alex_barron1. 4-bit Llama 8B with a [-----] token prompt + 8-bit KV cache generates at [--] toks/sec on an M2 Ultra: https://t.co/TlBZTOhpfD In the latest MLX generating with long prompts is much faster with KV quantization. Thanks to @alex_barron1. 4-bit Llama 8B with a [-----] token prompt + 8-bit KV cache generates at [--] toks/sec on an M2 Ultra: https://t.co/TlBZTOhpfD"
X Link 2024-11-06T22:05Z [----] followers, [---] engagements
"In the latest MLX generating with long prompts is much faster with KV quantization. Thanks to @alex_barron1. 4-bit Llama 8B with a [-----] token prompt + 8-bit KV cache generates at [--] toks/sec on an M2 Ultra:"
X Link 2024-11-06T16:14Z 37.8K followers, [----] engagements
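What an "8-bit KV cache" means mechanically, as a hedged sketch using MLX's documented mx.quantize / mx.dequantize ops: cached keys and values are stored as 8-bit integers plus per-group scales and dequantized when attending. mlx-lm's actual cache implementation is more involved than this.

```python
import mlx.core as mx

keys = mx.random.normal((1024, 128))  # (seq_len, head_dim) for one head
kq, scales, biases = mx.quantize(keys, group_size=64, bits=8)
keys_restored = mx.dequantize(kq, scales, biases, group_size=64, bits=8)
mx.eval(keys_restored)  # roughly 4x less cache memory than fp32, small accuracy cost
```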
"@CSProfKGD @CVPR This is great Could not think of anybody better for this :-)"
X Link 2021-04-12T09:05Z [----] followers, [--] engagements
"@KassinosS @awnihannun Out of curiosity how would a simple relu MLP that passes the inputs through a simple sinusoidal positional encoding do in that problem In my experience they are a pretty good baseline for any such function approximation. See for examples of what I mean. https://bmild.github.io/fourfeat/ https://bmild.github.io/fourfeat/"
X Link 2024-01-23T10:17Z [----] followers, [---] engagements
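A hedged sketch of the baseline being suggested, in the spirit of the linked Fourier-feature work: pass low-dimensional inputs through a random sinusoidal encoding, then a plain relu MLP. The projection scale is a tunable assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.normal(scale=10.0, size=(2, 256))   # random projection; scale controls frequency

def fourier_encode(x):                      # x: (N, 2) low-dimensional inputs
    proj = 2 * np.pi * x @ B
    return np.concatenate([np.sin(proj), np.cos(proj)], axis=-1)  # (N, 512)

# feed fourier_encode(x) to any small relu MLP; without this encoding the MLP
# struggles to fit high-frequency functions of low-dimensional inputs
```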
"Yay Our awesome group is growing I have two open phd positions in my group at @Idiap_ch / @EPFL_en Both in deep learning one in computer vision to combine multi-sensors for scene reconstruction and the other for weather forecast and air traffic control. https://t.co/wtRcjpc4Vl I have two open phd positions in my group at @Idiap_ch / @EPFL_en Both in deep learning one in computer vision to combine multi-sensors for scene reconstruction and the other for weather forecast and air traffic control. https://t.co/wtRcjpc4Vl"
X Link 2019-07-20T06:32Z [----] followers, [--] engagements
"I have two open phd positions in my group at @Idiap_ch / @EPFL_en Both in deep learning one in computer vision to combine multi-sensors for scene reconstruction and the other for weather forecast and air traffic control. https://www.idiap.ch/fleuret/hiring.html https://www.idiap.ch/fleuret/hiring.html"
X Link 2019-07-19T16:12Z 48.7K followers, [---] engagements
"@WankyuChoi I am super happy you picked it up ๐. I actually added it to the example after seeing your previous demo and comments. Great video as always"
X Link 2023-12-30T22:27Z [----] followers, [---] engagements
"@francoisfleuret Advice for people looking for a career: learn software engineering"
X Link 2020-03-04T07:44Z [----] followers, [--] engagements
"@ivanfioravanti @emrekoctw @awnihannun The UNet and text encoders should be fine as they only need about 4GB when quantized. The decoder otoh needs more. The trick there is to apply the decoder in a tiling fashion but I am not 100% sure it will be straightforward"
X Link 2024-03-10T16:03Z [----] followers, [---] engagements
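A hedged sketch of what decoding "in a tiling fashion" could look like: run the VAE decoder on overlapping latent tiles and stitch the outputs, so peak memory scales with the tile rather than the full image. `decoder` is a placeholder, and real implementations blend the overlaps to hide seams.

```python
def decode_tiled(decoder, latents, tile=64, overlap=8):
    h, w = latents.shape[-2:]
    out_rows = []
    for y in range(0, h, tile - overlap):
        row = []
        for x in range(0, w, tile - overlap):
            patch = latents[..., y:y + tile, x:x + tile]
            row.append(decoder(patch))   # decode one small tile at a time
        out_rows.append(row)
    return out_rows                      # stitching/blending of overlaps omitted
```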
"@priontific This is possible in all frameworks. The only thing that changes is simplicity and efficiency. Eg I am pretty sure that in PyTorch one would have to manually unload layers mid computation. Simply put with MLX the efficiency is great and I dont think it gets any simpler"
X Link 2024-09-06T07:35Z [----] followers, [--] engagements
"@ivanfioravanti Looking at that you should probably also increase the batch size. It will help amortize the communication cost. Here each machine deals with [--] sequences I would do [--] per machine meaning [--] total"
X Link 2024-11-03T21:23Z [----] followers, [--] engagements
"Fantastic work by @NasFilippova before even starting her PhD ๐"
X Link 2024-10-11T06:29Z [----] followers, [---] engagements
"@dimadamen @ducha_aiki Oops sorry if it was perceived as whining mostly meant as a joke ๐. Thanks a lot for the reply and taking it into account for the future"
X Link 2021-01-12T09:07Z [----] followers, [--] engagements
"@unixpickle @gazorp5 @awnihannun It would be quite an architectural change I believe to have unified memory in either of the two. It is not as simple as making a backend since the operations need to synchronize but not copy even though they may run on GPU or CPU"
X Link 2023-12-06T03:48Z [----] followers, [---] engagements
"@awnihannun @priontific It is also the schnell model which is significantly harder to fine tune"
X Link 2024-10-12T19:29Z [----] followers, [---] engagements
"@unixpickle @gazorp5 @awnihannun Moreover designing a backend would mean we inherit all the negative aspects of these frameworks whether they are shape based compilation or eager computation or something else"
X Link 2023-12-06T03:48Z [----] followers, [---] engagements
"@ducha_aiki @francoisfleuret @apoorv2904 @nik0spapp @jb_cordonnier Well not instead of self-attention but you could look at that uses a similar mechanism with completely data independent values to replace fully connected layers. https://arxiv.org/abs/1907.05242 https://arxiv.org/abs/1907.05242"
X Link 2020-07-13T11:32Z [----] followers, [--] engagements
"@lucidrains @apoorv2904 @SmallerNNsPls @francoisfleuret @trees_random @icmlconf @nik0spapp @Idiap_ch @EPFL In pseudocode yes. In practice this requires N times more memory than necessary so we opt for a custom CUDA kernel. During inference this is kept as the state so one only needs the last value of the cumsum anyway (so no custom kernels necessary)"
X Link 2020-06-02T22:41Z [----] followers, [--] engagements
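The inference-time view mentioned here, as a hedged NumPy sketch: keep the running sums S = Σ_i φ(k_i) v_i^T and z = Σ_i φ(k_i) as the RNN state, so each new token costs O(1) instead of re-attending to the whole past.

```python
import numpy as np

def phi(x):
    return np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1

def step(state, q, k, v):
    S, z = state
    S = S + np.outer(phi(k), v)          # accumulate phi(k_i) v_i^T
    z = z + phi(k)                       # accumulate phi(k_i), the "last cumsum value"
    out = (phi(q) @ S) / (phi(q) @ z)    # same output as full linear attention
    return (S, z), out
```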
"@Prince_Canuma Yeah of course you can implement it. But fyi pipeline training is significantly more complicated than pipeline inference (and harder to get linear scaling as well)"
X Link 2024-11-01T00:32Z [----] followers, [--] engagements
"@andriy_mulyar @_joaogui1 @pragmaticml Besides the custom kernels I think the jax implementation of linear attention is a bit off. In theory it should be identical to performers without the feature map so at least as fast. In our implementation it is 2-3 times faster than FAVOR with [---] dims"
X Link 2020-11-10T04:47Z [----] followers, [--] engagements
"@Suuraj @francoisfleuret @pafrossard @AlexAlahi @_beenkim @LudovicDenoyer Congratulations to both you guys Well deserved ๐ฅณ๐ฅณ๐ฅณ"
X Link 2021-12-18T10:29Z [----] followers, [--] engagements
"@ivanfioravanti Thats exciting ๐. Ethernet thunderbolt or WiFi"
X Link 2024-11-03T19:51Z [----] followers, [--] engagements
"@dave_andersen @ykilcher @_florianmai @jiangelaa @zacharylipton @francoisfleuret It's my bad for posting it without more context. It is the empirical variance of the mini-batch gradient under different sampling distributions. Namely we sample mini-batches compute the grad and compare the norm of the diff with the average gradient"
X Link 2019-10-10T23:02Z [----] followers, [--] engagements
"@priontific @MePavelKral @awnihannun Hmm try to not generate progress images it saves some memory (it shouldn't but it does will look into it). For 512x512 batch size [--] I am getting 49GB so it should be doable without swap. Agreed though QLoRA (and/or checkpointing) is gonna make a huge difference"
X Link 2024-10-12T02:25Z [----] followers, [---] engagements
"@walkfourmore You can fine tune it using LoRA on your laptop (see the MLX examples). An 8GB MacBook Air wont break any speed records but you can easily fine tune it on your data over night if they are about a book long"
X Link 2024-03-30T21:43Z [----] followers, [--] engagements
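A hedged sketch of kicking off the LoRA fine-tuning described above via the mlx-lm CLI from the MLX examples; the repo id, data path, and flag set are assumptions, so check `python -m mlx_lm.lora --help` against your installed version.

```python
import subprocess

subprocess.run([
    "python", "-m", "mlx_lm.lora",
    "--model", "mlx-community/Mistral-7B-Instruct-v0.2-4bit",  # assumed repo id
    "--train",
    "--data", "path/to/data",  # directory with train/valid .jsonl files
    "--iters", "600",
])
```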
"I know I probably shouldn't be using in my code but keras definitely shouldn't be using 'from tensorflow_backend import *' . http://K.tf http://K.tf"
X Link 2019-08-28T12:01Z [----] followers, [--] engagements
"@MePavelKral @awnihannun Its currently around 50GB but there are many things left to do to get this down to less than [--]. More PRs coming โบ"
X Link 2024-10-11T19:00Z [----] followers, [--] engagements
"@chriswolfvision @francoisfleuret Well I think broadcasting is great The problem is with implicit expand_dims. Who thought that it was a good idea to implicitly resize tensors so that the dims work Under that reasoning all element wise operations are possible by expanding enough times both tensors"
X Link 2021-01-05T17:19Z [----] followers, [--] engagements
"@Prince_Canuma Indeed. It is also the case for flux finetuning btw"
X Link 2024-10-31T23:30Z [----] followers, [--] engagements
"Removing a public member from a python module is a backwards incompatible change and should incur a major version change. Looking at you keras.backend . that you no longer provide tf moving from v2.2.4 to v2.2.5 . @fchollet"
X Link 2019-08-28T12:01Z [----] followers, [--] engagements
"@ivanfioravanti Hm feel free to drop an issue if you think that there is something to be fixed"
X Link 2025-04-04T15:57Z [----] followers, [---] engagements
"@dave_andersen @ykilcher @_florianmai @jiangelaa @zacharylipton @francoisfleuret Specifically if we consider the gradient norm as an indicator on whether a sample is informative we see that for Imagenet the distribution of the norms is much closer to uniform (hence we cannot reduce the variance as depicted)"
X Link 2019-10-10T22:06Z [----] followers, [--] engagements