# Unsloth AI (@UnslothAI)

Unsloth AI posts on X about ai, accuracy, vram, and agentic the most. They currently have [------] followers and [---] posts still getting attention, totaling [-------] engagements in the last [--] hours.

### Engagements: [-------]

- [--] Week: [-------] +49%
- [--] Month: [---------] +238%
- [--] Months: [---------] +261%
- [--] Year: [---------] +426%

### Mentions: [--]

- [--] Week: [--] +18%
- [--] Month: [--] +96%
- [--] Months: [---] +98%
- [--] Year: [---] +290%

### Followers: [------]

- [--] Week: [------] +2.80%
- [--] Month: [------] +12%
- [--] Months: [------] +46%
- [--] Year: [------] +180%

### CreatorRank: [-------]

### Social Influence

**Social category influence:** [technology brands](/list/technology-brands), [stocks](/list/stocks), [vc firms](/list/vc-firms), [finance](/list/finance), [automotive brands](/list/automotive-brands), [celebrities](/list/celebrities), [products](/list/products)

**Social topic influence:** [ai](/topic/ai) #1763, [accuracy](/topic/accuracy), [vram](/topic/vram) #35, [agentic](/topic/agentic) #9, [open ai](/topic/open-ai), [how to](/topic/how-to), [gpu](/topic/gpu), [inference](/topic/inference) #402, [faster](/topic/faster), [llm](/topic/llm) #10

**Top accounts mentioned or mentioned by:** @huggingface, @alibabaqwen, @danielhanchen, @grok, @nvidia, @deepseekai, @zaiorg, @amdindia, @vipulgupta2048, @foley2k2, @scheminglunatic, @mistralai, @ycombinator, @openai, @pytorch, @nvidiaaidev, @emmanuel_mr18, @agentcommunity_, @rohanpaulai, @kaggle

**Top assets mentioned:** [Alphabet Inc Class A (GOOGL)](/topic/$googl), [DeepSeek (DEEPSEEK)](/topic/deepseek), [IBM (IBM)](/topic/ibm), [Gains (GAINS)](/topic/gains), [FilesCoins Power Cu (FILECOIN)](/topic/files), [DeepSeek AI Agent (DEEPSEEKAI)](/topic/deepseek-ai), [Microsoft Corp. (MSFT)](/topic/microsoft), [Flex Ltd. Ordinary Shares (FLEX)](/topic/$flex)

### Top Social Posts

Top posts by engagements in the last [--] hours.

"Unsloth now supports fine-tuning of LLMs with 4x longer context windows. We managed to reduce memory usage by a further 30% at the cost of +1.9% extra time overhead. Read our blog: http://unsloth.ai/blog/long-context" [X Link](https://x.com/UnslothAI/status/1777735916309926052) 2024-04-09T16:30Z [----] followers, [----] engagements
"This works on all model architectures which use gradient checkpointing (i.e. Stable Diffusion, Mamba, etc.) See bar graph for memory saving benchmarks:" [X Link](https://x.com/UnslothAI/status/1777737003037270398) 2024-04-09T16:35Z [----] followers, [---] engagements

"Long-context Llama [--] finetuning is here. Unsloth supports 48K context lengths for Llama-3 70b on an 80GB GPU - 6x longer than HF+FA2. QLoRA finetuning Llama-3 70b is 1.8x faster and uses 68% less VRAM, and Llama-3 8b is 2x faster and fits in an 8GB GPU. Blog: https://www.unsloth.ai/blog/llama3" [X Link](https://x.com/UnslothAI/status/1783200234669236532) 2024-04-24T18:24Z [----] followers, 59.4K engagements

"Mistral's new model NeMo (12B) is now supported. Unsloth makes finetuning NeMo fit in a 12GB GPU: QLoRA training is 2x faster, uses 60% less memory, and we support 3-4x longer context lengths than HF+FA2. Read our Blog: https://unsloth.ai/blog/mistral-nemo" [X Link](https://x.com/UnslothAI/status/1814326379342921734) 2024-07-19T15:48Z [----] followers, 11.6K engagements

"@rohanpaul_ai @MistralAI @nvidia @danielhanchen Thank you so much Rohan as always for supporting Unsloth 🦥 Hope you will like Unsloth Studio (our upcoming UI) which will hopefully be out next week. 🥰" [X Link](https://x.com/UnslothAI/status/1814928428753432662) 2024-07-21T07:40Z [----] followers, [---] engagements

"@danielhanchen We have uploaded 4bit bnb quants for now and are working on Llama [---] support. Llama [---] (8B) 4bit: https://huggingface.co/unsloth/Meta-Llama-3.1-8B-bnb-4bit Llama [---] (8B) Instruct 4bit: https://huggingface.co/unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit Llama [---] (70B) 4bit: https://huggingface.co/unsloth/Meta-Llama-3.1-70B-bnb-4bit Llama [---] (70B) Instruct 4bit: https://huggingface.co/unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit" [X Link](https://x.com/UnslothAI/status/1815800330216694109) 2024-07-23T17:25Z [----] followers, [----] engagements

"Llama [---] support is here. Unsloth supports 48K context lengths for Llama [---] (70B) on an 80GB GPU - 6x longer than HF+FA2. QLoRA fine-tuning Llama [---] (70B) is 1.9x faster and uses 65% less VRAM, and Llama [---] (8B) is 2.1x faster and fits in an 8GB GPU. Blog: https://unsloth.ai/blog/llama3-1" [X Link](https://x.com/UnslothAI/status/1815842967573389321) 2024-07-23T20:14Z [----] followers, 11.1K engagements

"We just hit [--] million monthly downloads on @HuggingFace 🦥🥳 Over 13K models trained with Unsloth have also been uploaded to Hugging Face. Huge thanks to the Unsloth community, the model teams and the HF team 🤗 http://huggingface.co/unsloth" [X Link](https://x.com/UnslothAI/status/1823014327848460321) 2024-08-12T15:11Z [----] followers, [----] engagements

"We're excited to share that Unsloth is now backed by @YCombinator. Building on our foundation in open-source fine-tuning, we're creating the all-in-one solution so you can focus on making the models you've always dreamed of without the complexity. With a focus on accuracy, speed and accessibility, we use math, algorithms and low-level languages (Triton, CUDA) to innovate the LLM ecosystem through software, not hardware. Join our waitlist: https://unsloth.ai/waitlist Read our roadmap: https://unsloth.ai/roadmap-yc" [X Link](https://x.com/UnslothAI/status/1831715700031025455) 2024-09-05T15:27Z [----] followers, 73K engagements
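The long-context posts above hinge on Unsloth's offloaded gradient checkpointing. A minimal sketch of how that option is enabled through the current `unsloth` Python API; the model name, rank, and sequence length are illustrative placeholders, not the posts' exact configuration:

```python
from unsloth import FastLanguageModel

# Load a 4-bit base model for QLoRA; max_seq_length is the long-context target.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",
    max_seq_length=48_000,
    load_in_4bit=True,
)

# "unsloth" selects the offloaded gradient-checkpointing variant the posts
# describe: activations are staged off the GPU, trading a small amount of
# extra time for a large reduction in VRAM use.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    use_gradient_checkpointing="unsloth",
)
```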
"@rohanpaul_ai @danielhanchen @ycombinator Thank you so much Rohan for the constant support. We really, really appreciate it. And excited to show you some of our new features" [X Link](https://x.com/UnslothAI/status/1831808188343316634) 2024-09-05T21:34Z [----] followers, [--] engagements

"@ynktk1 Hi there, apologies. We have now enabled pip install unsloth and will also be working on a possible Docker image. Is there any particular reason why it was hard to install? Thank you" [X Link](https://x.com/UnslothAI/status/1831845547407503384) 2024-09-06T00:03Z [----] followers, [---] engagements

"@LucasAtkins7 @danielhanchen @ycombinator Thank you Lucas for being a supporter from day one" [X Link](https://x.com/UnslothAI/status/1832221045660725259) 2024-09-07T00:55Z [----] followers, [---] engagements

"Llama [---] versions including GGUFs + bnb [--]-bit versions + reuploaded versions are now on @HuggingFace. See all versions of Llama [---] here: https://huggingface.co/collections/unsloth/llama-32-66f46afde4ca573864321a22 We are actively working on supporting Vision models and 1B and 3B." [X Link](https://x.com/UnslothAI/status/1839036956245897685) 2024-09-25T20:19Z [----] followers, 14.2K engagements

"@snapolino @danielhanchen It's from Meta" [X Link](https://x.com/UnslothAI/status/1839123203865842137) 2024-09-26T02:01Z [----] followers, [--] engagements

"You can finetune Llama-3.2 for free on Colab now. Unsloth makes finetuning 2x faster and uses 60% less VRAM with no accuracy degradation. Llama [---] (1B) QLoRA fits on a 4GB GPU and (3B) fits on 7GB. Vision support coming soon. Finetuning Colab: https://colab.research.google.com/drive/1T5-zKWM_5OD21QHwXHiV9ixTRR7k3iB9?usp=sharing" [X Link](https://x.com/UnslothAI/status/1839340091241869698) 2024-09-26T16:23Z [----] followers, 102.3K engagements

"@TheCoinCollect8 It's definitely possible but yes, very slow. I'd recommend using llama.cpp for this" [X Link](https://x.com/UnslothAI/status/1839413474826637547) 2024-09-26T21:15Z [----] followers, [---] engagements

"Today we're releasing a new method that improves the way everyone trains LLMs. There's a significant bug that causes loss miscalculations during training. Our Gradient Accumulation fix corrects the issue, reducing L2 norm error by 10x. Blog details: http://unsloth.ai/blog/gradient" [X Link](https://x.com/UnslothAI/status/1846231235749990699) 2024-10-15T16:46Z [----] followers, 28K engagements

"Join us & @GPU_Mode tomorrow at 3pm ET where we'll talk about our Gradient Accumulation Fix, Triton + CUDA kernels & more. Thanks to @MarkSaroufim & @neurosp1ke for inviting us. Meeting: https://discord.gg/enps8abK?event=1289330796015915178" [X Link](https://x.com/UnslothAI/status/1847359103271948517) 2024-10-18T19:28Z 32K followers, [----] engagements
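The Gradient Accumulation posts above refer to a loss-normalization bug: averaging each micro-batch's mean cross-entropy mis-weights batches with different numbers of unpadded tokens. A minimal PyTorch sketch of the difference; the shapes and the -100 padding convention are standard practice, not Unsloth's actual code:

```python
import torch.nn.functional as F

def naive_ga_loss(logits_list, labels_list):
    # Buggy: each micro-batch mean is weighted equally, so a heavily padded
    # (short) batch contributes as much as a full one, skewing the loss.
    losses = [F.cross_entropy(l, y, ignore_index=-100)
              for l, y in zip(logits_list, labels_list)]
    return sum(losses) / len(losses)

def fixed_ga_loss(logits_list, labels_list):
    # Fix: sum unreduced losses, then divide once by the total number of
    # unmasked tokens across all accumulated micro-batches.
    total_loss, total_tokens = 0.0, 0
    for logits, labels in zip(logits_list, labels_list):
        total_loss = total_loss + F.cross_entropy(
            logits, labels, ignore_index=-100, reduction="sum")
        total_tokens += (labels != -100).sum()
    return total_loss / total_tokens
```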
"You can finetune Qwen-2.5-Coder-14B for free on Colab now. Unsloth makes finetuning 2x faster & uses 60% less VRAM with no accuracy loss. We extended context lengths from 32K to 128K with YaRN & uploaded GGUFs: https://huggingface.co/collections/unsloth/qwen-25-coder-all-versions-6732bc833ed65dd1964994d4 Finetuning Colab: https://colab.research.google.com/drive/18sN803sU23XuJV9Q8On2xgqHSer6-UZF?usp=sharing" [X Link](https://x.com/UnslothAI/status/1856424217610465783) 2024-11-12T19:49Z 12.2K followers, 61.7K engagements

"You can finetune Llama-3.2-Vision-11B for free on Colab now. Unsloth finetunes VLMs 2x faster with 50% less VRAM and 6x longer context - with no accuracy loss. Documentation: https://docs.unsloth.ai/ GitHub: https://github.com/unslothai/unsloth Finetuning Colab: https://colab.research.google.com/drive/1j0N4XTY1zXXy7mPAhOC1_gMYZ2F2EBlk?usp=sharing" [X Link](https://x.com/UnslothAI/status/1859667930075758793) 2024-11-21T18:39Z [----] followers, 69.7K engagements

"@TuanPham672604 @_xjdr @IlyasHairline @shah_bu_land @cloneofsimo This was actually a known issue for a long time. We already fixed this issue back in February when Gemma got released and worked with Hugging Face to implement the fixes. See here for more info: https://unsloth.ai/blog/gemma-bugs" [X Link](https://x.com/UnslothAI/status/1862356980314382836) 2024-11-29T04:44Z [----] followers, [--] engagements

"We're excited to introduce Unsloth Dynamic 4-bit Quantization. Naive quantization often hurts accuracy, making models unusable, but we dynamically opt not to quantize certain parameters. Our approach delivers significant accuracy gains while only using 10% more VRAM than BitsandBytes 4-bit. Our tests show that standard 4-bit quants performed much worse than the original 16-bit versions, while Unsloth's Dynamic 4-bit quants provided very accurate & reliable results. Read our Blog: Dynamic 4-bit Quants on @HuggingFace: Colab notebook:" [X Link](https://x.com/UnslothAI/status/1864384960666456535) 2024-12-04T19:03Z 12.2K followers, 45.4K engagements

"Llama [---] versions including GGUFs + bnb 4-bit + original 16-bit are now on @HuggingFace. See all versions of Llama [---] here: https://huggingface.co/collections/unsloth/llama-33-all-versions-67535d7d994794b9d7cf5e9f Fine-tuning for Llama [---] (70B) is also now supported. Unsloth is 2x faster and uses 70% less memory." [X Link](https://x.com/UnslothAI/status/1865151062023512485) 2024-12-06T21:47Z 22.4K followers, 40.3K engagements
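The Dynamic 4-bit Quantization post describes opting out of quantizing accuracy-critical parameters. One way to approximate that idea with stock `bitsandbytes` is the `llm_int8_skip_modules` field of `BitsAndBytesConfig`; the module names below are hypothetical picks for illustration, not Unsloth's actual per-model selection:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    # Keep quantization-sensitive modules in the original dtype;
    # quantize everything else to 4-bit.
    llm_int8_skip_modules=["lm_head", "model.layers.0.mlp.down_proj"],
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
)
```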
"Llama [---] fine-tuning with ultra long context is here. Unsloth now supports 89K context for @AIatMeta's Llama [---] (70B) on an 80GB GPU - 13x longer than HF+FA2. For Llama [---] (8B), Unsloth enables 342K context, surpassing its native 128K support. Blog: https://unsloth.ai/blog/llama3-3" [X Link](https://x.com/UnslothAI/status/1866545164140810603) 2024-12-10T18:07Z [----] followers, 33.9K engagements

"@JoshPurtell @danielhanchen @6___0 Qwen QwQ is already supported. You just need to create your own dataset." [X Link](https://x.com/UnslothAI/status/1870567553585860702) 2024-12-21T20:30Z [----] followers, [--] engagements

"Learn how to fine-tune Llama for free in [--] mins. In this video @jasonzhou1993 uses Unsloth to fine-tune Llama [---] (3B) with a custom dataset to significantly enhance MidJourney prompts. Jason covers the A-Z of fine-tuning, including data prep with synthetic data, evaluation, free Colab training using Unsloth, deployment & more in the full video. Colab notebook: https://colab.research.google.com/drive/1T5-zKWM_5OD21QHwXHiV9ixTRR7k3iB9?usp=sharing Documentation: https://docs.unsloth.ai/ Full video: https://www.youtube.com/watch?v=jFl5Fewrieo" [X Link](https://x.com/UnslothAI/status/1874146501019963821) 2024-12-31T17:32Z [----] followers, 22.5K engagements

"Deepseek V3 including GGUF + bf16 versions are now on @HuggingFace. Min. requirements to run: 48GB RAM + 250GB of disk space for 2-bit. Includes [--], [--], [--], [--], [--] and 8-bit quantized versions. See all versions of Deepseek V3 & how to run it with examples: https://huggingface.co/collections/unsloth/deepseek-v3-all-versions-677cf5cfd7df8b7815fc723c" [X Link](https://x.com/UnslothAI/status/1876729710790815872) 2025-01-07T20:36Z 12K followers, 42.4K engagements

"Phi-4 including GGUF + 4-bit + 16-bit versions are now on @HuggingFace. We found & fixed [--] bugs in Phi-4 & Llamafied the model. View all Phi-4 versions with our bug fixes: https://huggingface.co/collections/unsloth/phi-4-all-versions-677eecf93784e61afe762afa Phi-4 fine-tuning is also supported. Unsloth is 2x faster & uses 70% less VRAM." [X Link](https://x.com/UnslothAI/status/1877136074042126338) 2025-01-08T23:31Z [----] followers, 21.9K engagements

"You can finetune Phi-4 for free on Colab now. Unsloth finetunes LLMs 2x faster with 70% less VRAM and 12x longer context - with no accuracy loss. GitHub repo: https://github.com/unslothai/unsloth Documentation: https://docs.unsloth.ai Phi-4 Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Phi_4-Conversational.ipynb" [X Link](https://x.com/UnslothAI/status/1877779176473944212) 2025-01-10T18:06Z 14.8K followers, 60.9K engagements
"You can finetune Phi-4 for free on @Kaggle now. You'll learn how to: prepare your dataset, train Phi-4 via Kaggle's free GPUs, and run, evaluate & save your model. Unsloth finetunes LLMs 2x faster with 70% less VRAM & no accuracy loss. Phi-4 notebook: https://www.kaggle.com/code/danielhanchen/phi-4-unsloth-notebook" [X Link](https://x.com/UnslothAI/status/1879942441538609583) 2025-01-16T17:23Z [----] followers, 26K engagements

"@levelsio DeepSeek R1 Distill Llama 8B seems to be the current most popular R1 GGUF and it will definitely run great on your laptop. We uploaded ALL of the GGUF files & they can be directly used with Jan AI, llama.cpp, Ollama, HF etc: https://x.com/UnslothAI/status/1881357596717891955 DeepSeek-R1 GGUFs are now on @HuggingFace. Includes all Llama & Qwen distilled models + [--] to 8-bit quantized versions. How to run R1: https://t.co/Ci22Tiu6fb DeepSeek-R1 Collection: https://t.co/JfVV5EA6qO" [X Link](https://x.com/UnslothAI/status/1882041892344611126) 2025-01-22T12:25Z [----] followers, [----] engagements

"@0xAsharib @tom_doerr @deepseek_ai We do, but only for the distilled versions. You can read more in our blog here: https://unsloth.ai/blog/deepseek-r1" [X Link](https://x.com/UnslothAI/status/1883619148519096575) 2025-01-26T20:52Z 11.2K followers, [--] engagements

"Introducing 1.58bit DeepSeek-R1 GGUFs. DeepSeek-R1 can now run in 1.58-bit while being fully functional. We shrank the 671B parameter model from 720GB to just 131GB - an 80% size reduction. Naively quantizing all layers breaks the model entirely, causing endless loops & gibberish outputs. Our dynamic quants solve this. The 1.58-bit quant fits in 160GB VRAM (2x H100 80GB) for fast inference at [---] tokens/sec. By studying DeepSeek-R1's architecture, we selectively quantized certain layers to higher bits (like 4-bit) and leave most MoE layers to 1.5-bit. Benchmarks + Blog: Dynamic GGUFs" [X Link](https://x.com/UnslothAI/status/1883899061893546254) 2025-01-27T15:25Z 19.9K followers, 685.1K engagements

"Run DeepSeek-R1 (671B) locally on @OpenWebUI - Full Guide. No GPU required. Using our 1.58-bit Dynamic GGUF and llama.cpp. Tutorial: https://docs.openwebui.com/tutorials/integrations/deepseekr1-dynamic/" [X Link](https://x.com/UnslothAI/status/1885404089200369846) 2025-01-31T19:05Z 18.8K followers, 66.4K engagements

"@vega_holdings @OpenWebUI Oooh maybe, really depends on demand. Issue is the imatrix quants will take a lot of time and money, but we'll see. We might release it for V3 first - or maybe not :)" [X Link](https://x.com/UnslothAI/status/1885503929376792724) 2025-02-01T01:42Z 12.2K followers, [----] engagements
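To make the 1.58-bit R1 workflow above concrete, a sketch of fetching just the dynamic quant shards with `huggingface_hub` and pointing llama.cpp at them. The `UD-IQ1_S` pattern matches Unsloth's published file naming for the 1.58-bit dynamic quant, but verify it against the repo before relying on it:

```python
from huggingface_hub import snapshot_download

# Download only the 1.58-bit dynamic quant (~131GB), not the whole repo.
snapshot_download(
    repo_id="unsloth/DeepSeek-R1-GGUF",
    local_dir="DeepSeek-R1-GGUF",
    allow_patterns=["*UD-IQ1_S*"],
)
# Then run with llama.cpp, offloading whatever layers fit in VRAM, e.g.:
#   ./llama-cli -m DeepSeek-R1-GGUF/<first-shard>.gguf -ngl 40 --ctx-size 8192
```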
"Unsloth is the #1 trending repo on GitHub 🦥 It's been an incredible journey and we couldn't have done it without you. To celebrate, we're taking a look back at how it all started and how we got here: http://unsloth.ai/blog/reintroducing GitHub repo: http://github.com/unslothai/unsloth" [X Link](https://x.com/UnslothAI/status/1889000210371932398) 2025-02-10T17:15Z 18.4K followers, 41.6K engagements

"Train your own reasoning LLM using DeepSeek's GRPO algorithm with our free notebook. You'll transform Llama [---] (8B) to have chain-of-thought. Unsloth makes GRPO use 80% less VRAM. Guide: https://docs.unsloth.ai/basics/reasoning-grpo GitHub: https://github.com/unslothai/unsloth Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.1_(8B)-GRPO.ipynb" [X Link](https://x.com/UnslothAI/status/1889726411478278183) 2025-02-12T17:21Z 18.7K followers, 101.8K engagements

"Today we're launching new algorithms that enable 10x longer context lengths & 90% less VRAM for training Reasoning Models (GRPO). Using Unsloth, you can now train your own reasoning model with just 5GB VRAM for Qwen2.5-1.5B, with no accuracy loss. Blog: https://unsloth.ai/blog/grpo" [X Link](https://x.com/UnslothAI/status/1892640995847901684) 2025-02-20T18:22Z 21.5K followers, 157.4K engagements

"Tutorial: Train your own Reasoning LLM for free. Make Llama [---] (8B) have chain-of-thought with DeepSeek's GRPO. Unsloth enables 90% less VRAM use. Learn about: reward functions + dataset prep, training on free Colab GPUs, running + evaluating. Guide: https://docs.unsloth.ai/basics/reasoning-grpo-and-rl/tutorial-train-your-own-reasoning-model-with-grpo" [X Link](https://x.com/UnslothAI/status/1894437705724924033) 2025-02-25T17:22Z 22K followers, 62.7K engagements

"Unsloth now works on Windows 🦥 Fine-tune LLMs locally on Windows without Linux or WSL. Tutorial: https://docs.unsloth.ai/get-started/installing-+-updating/windows-installation" [X Link](https://x.com/UnslothAI/status/1897334290935132602) 2025-03-05T17:12Z 20.6K followers, 32.2K engagements
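The GRPO posts above pair Unsloth with TRL's trainer and user-written reward functions. A compressed sketch under current `unsloth`/`trl` APIs; the two-row dataset and the exact-match reward are toy stand-ins, and trainer argument names may differ across TRL versions:

```python
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/Llama-3.1-8B-Instruct", max_seq_length=1024, load_in_4bit=True)
model = FastLanguageModel.get_peft_model(model, r=16)

dataset = Dataset.from_dict({
    "prompt": ["What is 2+2?", "What is 3*5?"],
    "answer": ["4", "15"],
})

# Reward = 1 when the sampled completion contains the reference answer;
# extra dataset columns ("answer") arrive as keyword arguments.
def correctness_reward(prompts, completions, answer, **kwargs):
    return [1.0 if a in c else 0.0 for c, a in zip(completions, answer)]

trainer = GRPOTrainer(
    model=model,
    processing_class=tokenizer,
    reward_funcs=[correctness_reward],
    args=GRPOConfig(num_generations=4, max_steps=50, output_dir="grpo-out"),
    train_dataset=dataset,
)
trainer.train()
```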
"We made a Guide to teach you how to fine-tune LLMs correctly. Learn about: choosing the right parameters & training method; RL, GRPO, DPO & CPT; data prep; overfitting & evaluation; training with Unsloth & deploying on vLLM, Ollama, Open WebUI. https://docs.unsloth.ai/get-started/fine-tuning-guide" [X Link](https://x.com/UnslothAI/status/1899132219064766652) 2025-03-10T16:16Z 20.6K followers, 93K engagements

"You can now fine-tune Gemma [--] for free with our notebook. Unsloth makes Gemma [--] finetuning 1.6x faster with 60% less VRAM and 6x longer context lengths - with no accuracy loss. Blogpost: https://unsloth.ai/blog/gemma3 GitHub: https://github.com/unslothai/unsloth Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma3_(4B).ipynb" [X Link](https://x.com/UnslothAI/status/1900609121742979159) 2025-03-14T18:05Z 20.2K followers, 128.5K engagements

"We teamed up with @HuggingFace to release a free notebook for fine-tuning Gemma [--] with GRPO. Learn to: enable reasoning in Gemma [--] (1B), prepare/understand reward functions, make GRPO work for tiny LLMs. Notebook: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/HuggingFace%20Course-Gemma3_(1B)-GRPO.ipynb Details: https://huggingface.co/reasoning-course" [X Link](https://x.com/UnslothAI/status/1902396234884903254) 2025-03-19T16:26Z 21.8K followers, 92.7K engagements

"You can now run DeepSeek-V3-0324 locally using our 2.71-bit Dynamic GGUF. We shrank 720GB to 231GB (-70%) by selectively quantizing layers. 2.71bit passes many code tests, producing nearly identical results to full 8bit. Guide: https://docs.unsloth.ai/basics/tutorial-how-to-run-deepseek-v3-0324-locally GGUF: https://huggingface.co/unsloth/DeepSeek-V3-0324-GGUF" [X Link](https://x.com/UnslothAI/status/1904717086041268676) 2025-03-26T02:08Z 25.9K followers, 47.7K engagements

"Another example, including standard 2-bit which produces broken code. The 2.71-bit quant fits in 231GB VRAM (3 H100s) for fast throughput inference at [---] tokens/s. We also uploaded 1.78-bit etc. quants, but for best results use our [----] or 2.71-bit quants. To run, have at least 160GB combined VRAM + RAM. By studying V3's architecture, we selectively quantize layers to higher bits (like 4-bit) and leave other MoE layers to lower bits (2.5-bit)" [X Link](https://x.com/UnslothAI/status/1904777589514027402) 2025-03-26T06:09Z 23.3K followers, [----] engagements
"We made a Guide on how to create Datasets for Fine-tuning. Learn to: curate high-quality datasets (with best practices & examples), format datasets correctly for conversation, SFT, GRPO, Vision etc., and generate synthetic data with Llama & ChatGPT. https://docs.unsloth.ai/basics/datasets-guide" [X Link](https://x.com/UnslothAI/status/1912162246345507229) 2025-04-15T15:13Z 23.1K followers, 56.3K engagements

"Microsoft releases Phi-4 reasoning models. You can now run them locally with our Dynamic GGUFs. Phi-4-reasoning-plus is only 14B parameters but performs on par with o1-mini, o3-mini and Sonnet [---]. GGUFs: https://huggingface.co/collections/unsloth/phi-4-all-versions-677eecf93784e61afe762afa We've been cooking: a new open-weights 14B Phi-4 reasoning model, SFT'd on 1.4M carefully curated reasoning demonstrations from o3-mini and RL'd for a tiny bit. This model is a little beast. https://t.co/4xJuvYpZBH" [X Link](https://x.com/UnslothAI/status/1917806961825046672) 2025-05-01T05:03Z 23.4K followers, 83.2K engagements

"You can now fine-tune Qwen3 (14B) for free with our notebook. Unsloth makes Qwen3 finetuning 2x faster with 70% less VRAM and 8x longer context lengths - with no accuracy loss. Guide: https://docs.unsloth.ai/basics/qwen3-how-to-run-and-fine-tune GitHub: https://github.com/unslothai/unsloth Colab: https://colab.research.google.com/drive/1_ZJD6xqYDvhRbKSQeV8pThLBphcVB9Wn?usp=sharing" [X Link](https://x.com/UnslothAI/status/1918335648764989476) 2025-05-02T16:03Z 24K followers, 191.2K engagements

"Thank you @Google for demoing how to fine-tune Gemma [--] with Unsloth free on Colab 🦥 #GoogleIO" [X Link](https://x.com/UnslothAI/status/1924977922915631443) 2025-05-20T23:57Z 24.4K followers, 18.1K engagements

"Mistral releases Devstral, a new model for coding agents. Devstral-Small-2505 is now the #1 open-source LLM on SWE-Bench Verified. At 24B params & built with All Hands, it scores 46.8% on SWE-Bench Verified - beating GPT-4.1-mini. Run & finetune via our GGUFs: https://huggingface.co/unsloth/Devstral-Small-2505-GGUF Meet Devstral, our SOTA open model designed specifically for coding agents and developed with @allhands_ai https://t.co/LwDJ04zapf https://t.co/Mm4lYZobGO" [X Link](https://x.com/UnslothAI/status/1925208355968303150) 2025-05-21T15:13Z 24.5K followers, 20.5K engagements

"We just crossed [--] million monthly downloads on @HuggingFace 🦥🤗 It's all thanks to you guys - the amazing community, model builders and HF team" [X Link](https://x.com/UnslothAI/status/1927727485246165200) 2025-05-28T14:03Z 24.9K followers, 29.4K engagements

"You can now run DeepSeek-R1-0528 with our Dynamic 1-bit GGUFs. We shrank the full 715GB model to just 185GB (-75% size). We achieve optimal accuracy by selectively quantizing layers. DeepSeek-R1-0528-Qwen3-8B is also supported. GGUFs: https://huggingface.co/unsloth/DeepSeek-R1-0528-GGUF" [X Link](https://x.com/UnslothAI/status/1928257120321032289) 2025-05-30T01:08Z 25.2K followers, 46.2K engagements
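The datasets guide post above covers formatting conversations for SFT. A minimal sketch of the usual pattern: map each conversation through the tokenizer's chat template into a single `text` column. The one-row dataset is illustrative:

```python
from datasets import Dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("unsloth/Llama-3.2-1B-Instruct")

raw = Dataset.from_dict({"conversations": [
    [{"role": "user", "content": "What is 2+2?"},
     {"role": "assistant", "content": "4"}],
]})

# Render each message list into one training string with the model's own
# chat template, so special tokens and turn markers match inference.
def to_text(example):
    return {"text": tokenizer.apply_chat_template(
        example["conversations"], tokenize=False)}

dataset = raw.map(to_text)
print(dataset[0]["text"])
```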
"Mistral releases Small [---] (24B), a new update to their [---] model. 🔥 The model performs much better on 5-shot MMLU (CoT), instruction following and function/tool calling. Run locally with FP8 or 16GB RAM using our Dynamic GGUFs with fixed chat template: https://huggingface.co/unsloth/Mistral-Small-3.2-24B-Instruct-2506-GGUF" [X Link](https://x.com/UnslothAI/status/1936426567850487925) 2025-06-21T14:10Z 25.9K followers, 21.6K engagements

"We made a Guide on mastering LoRA Hyperparameters so you can learn to fine-tune LLMs correctly. Learn to: train smarter models with fewer hallucinations; choose optimal learning rates, epochs, LoRA rank, alpha; avoid overfitting & underfitting. https://docs.unsloth.ai/get-started/fine-tuning-guide/lora-hyperparameters-guide" [X Link](https://x.com/UnslothAI/status/1937521408344752272) 2025-06-24T14:41Z 26.1K followers, 25K engagements

"Run Gemma 3n locally with our Dynamic GGUFs ✨ @Google's Gemma 3n supports audio, vision, video & text, and the 4B model fits on 8GB RAM for fast local inference. Fine-tuning is also supported in Unsloth. Gemma-3n-E4B GGUF: https://huggingface.co/unsloth/gemma-3n-E4B-it-GGUF I'm so excited to announce Gemma 3n is here: multimodal (text/audio/image/video) understanding, runs with as little as 2GB of RAM, first model under 10B with @lmarena_ai score of 1300+. Available now on @huggingface @kaggle llama.cpp https://t.co/CNDy479EEv and more" [X Link](https://x.com/UnslothAI/status/1938278293486309554) 2025-06-26T16:48Z 26.1K followers, 35.4K engagements

"You can now fine-tune Gemma 3n for free with our notebook. Unsloth makes Google Gemma training 1.5x faster with 50% less VRAM and 5x longer context lengths - with no accuracy loss. Guide: https://docs.unsloth.ai/basics/gemma-3n-how-to-run-and-fine-tune#fine-tuning-gemma-3n-with-unsloth GitHub: https://github.com/unslothai/unsloth Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma3N_(4B)-Conversational.ipynb" [X Link](https://x.com/UnslothAI/status/1940070928588972269) 2025-07-01T15:32Z 31.6K followers, 87.5K engagements

"We've teamed up with @GoogleDeepMind for a challenge with a $10,000 Unsloth prize 🦥 Show off your best fine-tuned Gemma 3n model using Unsloth, optimized for an impactful task. The entire hackathon has $150,000 in prizes to be won. Kaggle notebook: https://www.kaggle.com/code/danielhanchen/gemma-3n-4b-multimodal-finetuning-inference" [X Link](https://x.com/UnslothAI/status/1940414492791468240) 2025-07-02T14:17Z 31.6K followers, 47.8K engagements
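The LoRA hyperparameters guide above is about choosing rank, alpha, learning rate and epochs. A sketch of where those knobs live in an Unsloth + TRL SFT run; the values are common QLoRA starting points offered as assumptions, not the guide's prescriptions:

```python
from datasets import Dataset
from trl import SFTConfig, SFTTrainer
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/Llama-3.2-1B-Instruct", max_seq_length=2048, load_in_4bit=True)

# rank sets adapter capacity; alpha/r is the effective scaling factor.
model = FastLanguageModel.get_peft_model(model, r=16, lora_alpha=16)

dataset = Dataset.from_dict({"text": ["### Question: 2+2?\n### Answer: 4"]})

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,   # effective batch size 8
        learning_rate=2e-4,              # typical LoRA LR; full FT uses far less
        num_train_epochs=1,              # more epochs risks overfitting
        output_dir="sft-out",
    ),
)
trainer.train()
```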
"The Unsloth Gemma 3n Kaggle notebook can be used for any submission to the $150,000 challenges (not just the Unsloth-specific one). Gemma 3n competition details: https://www.kaggle.com/competitions/google-gemma-3n-hackathon" [X Link](https://x.com/UnslothAI/status/1940422698028638717) 2025-07-02T14:49Z 31.5K followers, [----] engagements

"We made step-by-step guides to fine-tune & run every single LLM 🦥 What you'll learn: technical analysis + bug fixes explained for each model, best practices & optimal settings, how to fine-tune with our notebooks, and a directory of model variants. https://docs.unsloth.ai/basics/tutorials-how-to-fine-tune-and-run-llms" [X Link](https://x.com/UnslothAI/status/1942581217838211543) 2025-07-08T13:47Z 27.8K followers, 50.8K engagements

"Mistral releases Devstral [----], the best open-source model for coding agents 🔥 The 24B model is now the #1 open LLM on SWE-Bench Verified, scoring 52.4%. Run Devstral-Small-2507 locally on 32GB RAM with our Dynamic quants & fine-tune with Unsloth. GGUFs: https://huggingface.co/unsloth/Devstral-Small-2507-GGUF Introducing Devstral Small and Medium [----]: this latest update offers improved performance and cost efficiency, perfectly suited for coding agents and software engineering tasks. https://t.co/l6MacctLrv" [X Link](https://x.com/UnslothAI/status/1943317113655189557) 2025-07-10T14:31Z 26.7K followers, 24.8K engagements

"You can now run Kimi K2 locally with our Dynamic 1.8-bit GGUFs. We shrank the full 1.1TB model to just 245GB (-80% size reduction). The 2-bit XL GGUF performs exceptionally well on coding & passes all our code tests. Guide: https://docs.unsloth.ai/basics/kimi-k2 GGUFs: https://huggingface.co/unsloth/Kimi-K2-Instruct-GGUF" [X Link](https://x.com/UnslothAI/status/1944780685409165589) 2025-07-14T15:27Z 27.8K followers, 128.9K engagements

"For fast inference of 5+ tokens/s, try to have your RAM + VRAM combined = the size of the quant (e.g. 256GB). If not, the model will still run with llama.cpp offloading, but be slower. Kimi K2 GGUF: https://huggingface.co/unsloth/Kimi-K2-Instruct-GGUF" [X Link](https://x.com/UnslothAI/status/1944798892811485670) 2025-07-14T16:39Z 27.1K followers, [----] engagements
"A Complete Guide to Fine-tuning LLMs in [--] mins. Learn to: choose the correct model & training method (LoRA, FFT, GRPO), build datasets & chat templates, train with Unsloth notebooks, and run & deploy your LLM in llama.cpp, Ollama & Open WebUI. Docs: https://docs.unsloth.ai/" [X Link](https://x.com/UnslothAI/status/1945481829206905055) 2025-07-16T13:53Z 28.7K followers, 30.7K engagements

"@Alibaba_Qwen Congrats guys on the release ✨ We're working on Dynamic quants & GGUFs so the community can run it locally" [X Link](https://x.com/UnslothAI/status/1947355536459989209) 2025-07-21T17:58Z 27.3K followers, 12.4K engagements

"You can now run Qwen3-235B-A22B-2507 with our Dynamic 2-bit GGUFs. The full 250GB model gets reduced to just 88GB (-65% size). Achieve [--] tokens/s on 89GB unified memory or 80GB RAM + 8GB VRAM. GGUFs: https://huggingface.co/unsloth/Qwen3-235B-A22B-Instruct-2507-GGUF Bye Qwen3-235B-A22B, hello Qwen3-235B-A22B-2507. After talking with the community and thinking it through, we decided to stop using hybrid thinking mode. Instead, we'll train Instruct and Thinking models separately so we can get the best quality possible. Today we're" [X Link](https://x.com/UnslothAI/status/1947635542436598271) 2025-07-22T12:31Z 27.8K followers, 37.3K engagements

"@Alibaba_Qwen Congrats guys on another epic release. We're uploading Dynamic GGUFs and one with 1M context length so you guys can run it locally 🦥 https://huggingface.co/unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF" [X Link](https://x.com/UnslothAI/status/1947768806165909820) 2025-07-22T21:20Z 27.8K followers, 17.2K engagements

"Run Qwen3-Coder with our Dynamic 2-bit GGUFs. We shrank the 480B parameter model to just 182GB (down from 512GB). Also run with 1M context length. Achieve [--] tokens/s on 182GB unified memory or 158GB RAM + 24GB VRAM. Qwen3-Coder-480B-A35B GGUFs: https://huggingface.co/unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF Qwen3-Coder is here. We're releasing Qwen3-Coder-480B-A35B-Instruct, our most powerful open agentic code model to date. This 480B-parameter Mixture-of-Experts model (35B active) natively supports 256K context and scales to" [X Link](https://x.com/UnslothAI/status/1947830633117716640) 2025-07-23T01:26Z 27.8K followers, 25.2K engagements

"@Alibaba_Qwen Congrats guys. You can run Qwen3-235B-A22B-Thinking-2507 with our Dynamic GGUFs 🥰 Run in 2-bit with 88GB unified mem or RAM for 6+ tokens/s. https://huggingface.co/unsloth/Qwen3-235B-A22B-Thinking-2507-GGUF" [X Link](https://x.com/UnslothAI/status/1948691802728661224) 2025-07-25T10:28Z 27.8K followers, 10.6K engagements

"You can now run Qwen3-235B-A22B-Thinking-2507 with our Dynamic 2-bit GGUFs. The full 250GB model gets reduced to just 87GB (-65% size). Achieve [--] tokens/s on 88GB unified memory or 80GB RAM + 8GB VRAM. GGUFs: https://huggingface.co/unsloth/Qwen3-235B-A22B-Thinking-2507-GGUF We're excited to introduce Qwen3-235B-A22B-Thinking-2507, our most advanced reasoning model yet. Over the past [--] months we've significantly scaled and enhanced the thinking capability of Qwen3, achieving improved performance in logical reasoning, math, science" [X Link](https://x.com/UnslothAI/status/1948693451492442365) 2025-07-25T10:34Z 28.1K followers, 19.6K engagements
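The run-locally posts above keep applying the same sizing rule: for roughly 5+ tokens/s, combined RAM + VRAM should cover the quant's file size, otherwise llama.cpp falls back to slower offloading. As a tiny worked check:

```python
def fast_enough(quant_gb: float, ram_gb: float, vram_gb: float) -> bool:
    # Rule of thumb from the posts: RAM + VRAM >= quant size for ~5+ tok/s.
    return ram_gb + vram_gb >= quant_gb

print(fast_enough(88, 80, 8))     # Qwen3-235B 2-bit on 80GB RAM + 8GB VRAM -> True
print(fast_enough(245, 192, 24))  # Kimi K2 1.8-bit on 192GB + 24GB -> False (slower)
```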
"@Alibaba_Qwen Thanks for releasing a smaller model guys 🥰 You can now run Qwen3-30B-A3B-0527 using Dynamic GGUFs. Only 33GB RAM or unified mem is needed to run the full 8-bit precision model at [--] tokens/s. https://huggingface.co/unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF" [X Link](https://x.com/UnslothAI/status/1950230615892377934) 2025-07-29T16:23Z 27.7K followers, [----] engagements

"Qwen3-30B-A3B-Instruct-2507 is here ✨ The 30B model rivals GPT-4o's performance and runs locally in full precision with just 33GB RAM. Run locally with Unsloth Dynamic GGUFs. Unsloth also supports Qwen3 fine-tuning and RL. GGUF: https://huggingface.co/unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF Qwen3-30B-A3B Small Update: smarter, faster and local deployment-friendly. Key enhancements: enhanced reasoning, coding and math skills; broader multilingual knowledge; improved long-context understanding (up to 256K tokens); better alignment with user intent. https://t.co/zsKfKJ2NRG" [X Link](https://x.com/UnslothAI/status/1950233007933313319) 2025-07-29T16:32Z 29.7K followers, 57.8K engagements

"The @GoogleDeepMind Gemma 3n Challenge ($150,000 in prizes) ends in [--] days. We've made [--] new fine-tuning Gemma 3n Kaggle notebooks (Vision & Audio) to spark your creativity. Your fine-tuned model can compete for any prize. Notebooks + Challenge details: https://www.kaggle.com/code/danielhanchen/gemma-3n-4b-multimodal-finetuning-inference" [X Link](https://x.com/UnslothAI/status/1950553466591723640) 2025-07-30T13:46Z 28.8K followers, 10.9K engagements

"Qwen3-Coder-Flash is here ✨ The 30B model excels in coding & agentic tasks. Run locally with up to 1M context length. Full precision runs with just 33GB RAM. We also fixed tool-calling support for Qwen3-Coder-30B-A3B-Instruct and 480B-A3B. GGUFs: https://huggingface.co/unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF Qwen3-Coder-Flash: Qwen3-Coder-30B-A3B-Instruct. Just lightning-fast, accurate code generation. Native 256K context (supports up to 1M tokens with YaRN). Optimized for platforms like Qwen Code, Cline, Roo Code, Kilo" [X Link](https://x.com/UnslothAI/status/1950935044387684478) 2025-07-31T15:02Z 32.3K followers, 44.6K engagements

"@Alibaba_Qwen You can still run any quant without meeting the 33GB RAM requirement - it'll just be slower. Maximum memory is only needed for optimal speeds or more context. Here's the link to the 1M context Qwen3-Coder-Flash GGUF: https://huggingface.co/unsloth/Qwen3-Coder-30B-A3B-Instruct-1M-GGUF" [X Link](https://x.com/UnslothAI/status/1950940422563623235) 2025-07-31T15:23Z 28.8K followers, [----] engagements
"@mrgshum @OpenAI You can still run it with 64GB RAM, it'll just be slower. Remember you can run any model size no matter how much compute you have. We're also working on smaller quants once llama.cpp adds in support for it :)" [X Link](https://x.com/UnslothAI/status/1952827651057713255) 2025-08-05T20:22Z 28.8K followers, [----] engagements

"@sama Thank you guys for supporting open-source. You can now run the 20B and 120B models locally with our GGUFs 🥰 https://huggingface.co/unsloth/gpt-oss-20b-GGUF" [X Link](https://x.com/UnslothAI/status/1952842401212698701) 2025-08-05T21:21Z 28.8K followers, 10.6K engagements

"You can now fine-tune OpenAI gpt-oss for free with our notebook. Unsloth trains 1.5x faster with -70% VRAM, 10x longer context & no accuracy loss. 20b fits in a 14GB GPU & 120b in 65GB. Guide: https://docs.unsloth.ai/basics/gpt-oss GitHub: https://github.com/unslothai/unsloth Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/gpt-oss-(20B)-Fine-tuning.ipynb" [X Link](https://x.com/UnslothAI/status/1953896997867729075) 2025-08-08T19:12Z 31.9K followers, 196.1K engagements

"Learn to fine-tune OpenAI gpt-oss with our new step-by-step guide. Learn about: local gpt-oss training + inference, FAQ & tips, evaluation, hyperparameters & overfitting, reasoning effort, data prep, and running & saving your LLM to llama.cpp, GGUF, HF. https://docs.unsloth.ai/basics/tutorial-how-to-fine-tune-gpt-oss" [X Link](https://x.com/UnslothAI/status/1957442965070397675) 2025-08-18T14:02Z 31.2K followers, 41.9K engagements

"@QuixiAI @deepseek_ai Thanks Eric & everyone, we really appreciate the support. Huge thanks to @ggerganov and the llama.cpp team for making this possible as well, and of course to the DeepSeek team 🥰" [X Link](https://x.com/UnslothAI/status/1959390128570544160) 2025-08-23T22:59Z 30.7K followers, [---] engagements

"@elonmusk @xai Thanks for supporting open-source. We'll try to investigate how we can create Dynamic GGUFs so everyone can run it locally" [X Link](https://x.com/UnslothAI/status/1959391483104289120) 2025-08-23T23:05Z 30.7K followers, 50.1K engagements

"RL used to be memory hungry, but not anymore. Introducing our new kernels & algos that allow faster RL with 50% less VRAM, [--] more context & no accuracy loss. RL before required GPU splitting between training & inference. Now with Standby you don't. http://docs.unsloth.ai/basics/memory-efficient-rl" [X Link](https://x.com/UnslothAI/status/1963633792695889924) 2025-09-04T16:02Z 31.7K followers, 69.5K engagements

"You can now run @xAI Grok [---] locally on just 120GB RAM. The 270B parameter model runs [--] t/s on a 128GB Mac with our Dynamic 3-bit GGUF. We shrunk the 539GB model to 118GB (-80%) & left key layers in higher 8-bits. Guide: https://docs.unsloth.ai/basics/grok-2 GGUF: https://huggingface.co/unsloth/grok-2-GGUF" [X Link](https://x.com/UnslothAI/status/1965047729991860396) 2025-09-08T13:41Z 31.9K followers, 109K engagements
"Unsloth Dynamic GGUFs were introduced early this year, where we selectively quantized some layers to as low as 1-bit and important layers to higher bits (6, 8-bit). Blog post: Our Dynamic GGUFs consistently perform better on Aider Polyglot when compared to other community quants for the same model size and quant type. To ensure a fair comparison we do the following: we select similar sized files and bit types to each Unsloth quant, and we use our fixed chat template if the community quant fails to execute the benchmark. We found some community quants having errors, and this gets fixed by using our" [X Link](https://x.com/UnslothAI/status/1965797781156901151) 2025-09-10T15:21Z 31.6K followers, [----] engagements

"You can now train Vision LLMs with Reinforcement Learning in our free notebook. Unsloth VLM RL via GRPO: [---] faster, 90% less VRAM, [--] longer context & no accuracy loss. Guide: https://docs.unsloth.ai/new/vision-reinforcement-learning-vlm-rl GitHub: https://github.com/unslothai/unsloth Qwen2.5-VL Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen2_5_7B_VL_GRPO.ipynb" [X Link](https://x.com/UnslothAI/status/1967987928229199873) 2025-09-16T16:24Z 32.8K followers, 142.2K engagements

"@Alibaba_Qwen We're all super excited for Qwen3-VL 🥰" [X Link](https://x.com/UnslothAI/status/1968164908174041501) 2025-09-17T04:07Z 31.6K followers, [----] engagements

"Mistral releases Magistral [---], their new reasoning models 🔥 Magistral-Small-2509 excels at coding + math and is a major upgrade over Magistral [---]. Run the 24B model locally with 32GB RAM. Fine-tune with free notebook: https://docs.unsloth.ai/models/magistral-how-to-run-and-fine-tune#fine-tuning-magistral-with-unsloth GGUFs: https://huggingface.co/unsloth/Magistral-Small-2509-GGUF" [X Link](https://x.com/UnslothAI/status/1968343162923175949) 2025-09-17T15:55Z 32.4K followers, 51.2K engagements

"@deepseek_ai Thank you for another update. We're excited to make Dynamic GGUFs so you all can run it locally https://huggingface.co/unsloth/DeepSeek-V3.1-Terminus-GGUF" [X Link](https://x.com/UnslothAI/status/1970121459298398632) 2025-09-22T13:42Z 32K followers, 20.8K engagements

"We're teaming up with @MistralAI and @NVIDIA for an Unsloth event on Tues Oct [--] at @YCombinator's office 🦥 Join us in San Francisco for a night of talks, merch and more. Food & drinks provided. RSVP required: http://lu.ma/unsloth-yc" [X Link](https://x.com/UnslothAI/status/1970127303071211764) 2025-09-22T14:05Z 32.3K followers, 30.2K engagements
"You can now train OpenAI gpt-oss with Reinforcement Learning in our free notebook. This notebook automatically creates faster kernels via RL. Unsloth RL achieves the fastest inference & lowest VRAM vs. any setup - [--] accuracy loss. gpt-oss-20b GRPO Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/gpt-oss-(20B)-GRPO.ipynb" [X Link](https://x.com/UnslothAI/status/1971602146270580857) 2025-09-26T15:45Z 32.7K followers, 123.4K engagements

"The notebook shows how to counteract reward-hacking, which is one of RL's biggest challenges. Blog + details: https://docs.unsloth.ai/new/gpt-oss-reinforcement-learning Since inference is crucial and vLLM is incompatible with gpt-oss RL, we developed custom algorithms in Unsloth to deliver the fastest inference (3x faster), the lowest VRAM usage (50% less) and longest context lengths (8x more) - without any accuracy degradation." [X Link](https://x.com/UnslothAI/status/1971602149600829724) 2025-09-26T15:45Z 32.6K followers, [----] engagements

"Join us, @Pytorch and @AMD for a Virtual Hackathon on Oct 18-20 🔥 Win $10K in prizes by training the best AI agent via Unsloth. Sign up here: https://luma.com/4i64p3ec" [X Link](https://x.com/UnslothAI/status/1971986078430384310) 2025-09-27T17:11Z 32.4K followers, [---] engagements

"@deepseek_ai Thank you guys once again for supporting open-source and making AI more accessible. Hopefully we'll be able to make GGUFs to allow everyone to run DeepSeek-V3.2-Exp locally" [X Link](https://x.com/UnslothAI/status/1972637991756824706) 2025-09-29T12:22Z 32.6K followers, 21.7K engagements

"LoRA in reinforcement learning (RL) can match full-finetuning performance when done right 💡 A new @thinkymachines post shows how: using 10x larger learning rates, applying LoRA on all layers & more; LoRA at rank=1 even works. We're excited to have collaborated on this blog. LoRA makes fine-tuning more accessible but it's unclear how it compares to full fine-tuning. We find that the performance often matches closely - more often than you might expect. In our latest Connectionism post we share our experimental results and recommendations for LoRA. https://t.co/DcVmUKeOyw" [X Link](https://x.com/UnslothAI/status/1972768092817379385) 2025-09-29T20:59Z 32.7K followers, 61.7K engagements

"IBM releases Granite-4.0, their new series of open models. Run the 'Micro' 3B model on 4GB RAM or 'Small' 32B on 40GB RAM. Granite-4.0 excels at agentic tasks, doc analysis, RAG, edge AI applications & more. Dynamic GGUFs: https://huggingface.co/collections/unsloth/granite-40-68ddf64b4a8717dc22a9322d Guide: https://docs.unsloth.ai/new/ibm-granite-4.0" [X Link](https://x.com/UnslothAI/status/1973753481636045253) 2025-10-02T14:14Z 32.8K followers, 42.7K engagements
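The reward-hacking post above is about completions that game the reward without solving the task. A toy illustration of the standard countermeasure, gating any positive reward behind a verifier; the checks here are deliberately crude placeholders, not the notebook's actual logic:

```python
def passes_tests(code: str) -> bool:
    # Stand-in verifier (hypothetical): a real one would execute the generated
    # kernel against reference outputs; here we only check the entry point.
    return "def matmul(" in code

def guarded_reward(completions, **kwargs):
    # Speed is only rewarded once correctness holds, so the policy cannot
    # "hack" the reward with fast-but-wrong (e.g. hard-coded) answers.
    rewards = []
    for code in completions:
        if not passes_tests(code):
            rewards.append(-1.0)          # wrong code is penalized outright
        elif "lookup_table" in code:      # crude hard-coding detector
            rewards.append(-0.5)
        else:
            rewards.append(1.0)
    return rewards

print(guarded_reward(["def matmul(a, b): return fast_kernel(a, b)"]))  # [1.0]
```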
"@Alibaba_Qwen Go Qwen team. Thank you for releasing smaller models" [X Link](https://x.com/UnslothAI/status/1974293141319790771) 2025-10-04T01:59Z 32.7K followers, [----] engagements

"@AMD @OpenAI Congrats. We're also very excited to enable local, efficient fine-tuning and reinforcement learning for AMD GPUs very soon 🦥" [X Link](https://x.com/UnslothAI/status/1975181261988864196) 2025-10-06T12:48Z 32.5K followers, [----] engagements

"Thank you @dkundel from OpenAI and Barath from NVIDIA for the collab 🥰 Watch Dominik's full gpt-oss presentation: https://www.youtube.com/watch?v=1HL2YHRj270" [X Link](https://x.com/UnslothAI/status/1976292344136958364) 2025-10-09T14:23Z 32.8K followers, [----] engagements

"DeepSeek-R1 GGUFs are now on @HuggingFace. Includes all Llama & Qwen distilled models + [--] to 8-bit quantized versions. How to run R1: https://unsloth.ai/blog/deepseek-r1 DeepSeek-R1 Collection: https://huggingface.co/collections/unsloth/deepseek-r1-all-versions-678e1c48f5d2fce87892ace5" [X Link](https://x.com/UnslothAI/status/1881357596717891955) 2025-01-20T15:06Z 18.3K followers, 68.4K engagements

"You can now reproduce DeepSeek-R1's reasoning on your own local device. Experience the "Aha" moment with just 7GB VRAM. Unsloth reduces GRPO training memory use by 80%. 15GB VRAM can transform Llama-3.1 (8B) & Phi-4 (14B) into reasoning models. Blog: http://unsloth.ai/blog/r1-reasoning" [X Link](https://x.com/UnslothAI/status/1887562753126408210) 2025-02-06T18:03Z 24K followers, [----] engagements

"You can now fine-tune TTS models with Unsloth. Train, run and save models like Sesame-CSM and OpenAI's Whisper locally with our free notebooks. Unsloth makes TTS training 1.5x faster with 50% less VRAM. GitHub: https://github.com/unslothai/unsloth Docs & Notebooks: https://docs.unsloth.ai/basics/text-to-speech-tts-fine-tuning" [X Link](https://x.com/UnslothAI/status/1923055371008213086) 2025-05-15T16:38Z 33.8K followers, 127.3K engagements
"We made a repo with 100+ fine-tuning notebooks, all in one place. Has guides & examples for: tool-calling, classification, synthetic data, BERT, TTS, vision LLMs, GRPO, DPO, SFT, CPT, data prep, eval, saving, Llama, Qwen, Gemma, Phi, DeepSeek. https://github.com/unslothai/notebooks/" [X Link](https://x.com/UnslothAI/status/1930255555416994126) 2025-06-04T13:29Z 36.7K followers, 84.7K engagements

"We made a complete Guide on Reinforcement Learning for LLMs. Learn about: RL's goal & why it's key to building intelligent AI agents; why o3, Claude [--] & R1 use RL; GRPO, RLHF, DPO, reward functions; training your own local R1 model via Unsloth. https://docs.unsloth.ai/basics/reinforcement-learning-guide" [X Link](https://x.com/UnslothAI/status/1934983471912591612) 2025-06-17T14:36Z 39.7K followers, 70.5K engagements

"You can now run the world's most powerful Western open models locally. The hybrid reasoning 671B model matches o3 & Claude-4-Opus in performance. Trained on Llama [--] & DeepSeek-R1, Cogito-v2 has [--] variants, each setting new benchmarks. Guide + GGUFs: https://docs.unsloth.ai/basics/tutorials-how-to-fine-tune-and-run-llms/cogito-v2-how-to-run-locally Today we are releasing [--] hybrid reasoning models of sizes 70B, 109B MoE, 405B, 671B MoE under open license. These are some of the strongest LLMs in the world and serve as a proof of concept for a novel AI paradigm - iterative self-improvement (AI systems" [X Link](https://x.com/UnslothAI/status/1951082845344276493) 2025-08-01T00:49Z 33.3K followers, 40.7K engagements

"@OpenAI Amazing guys. Super excited to support them so y'all can run & fine-tune them locally" [X Link](https://x.com/UnslothAI/status/1952778078935273591) 2025-08-05T17:05Z 36.6K followers, 27.6K engagements

"You can now run gpt-oss-120b & 20b locally with our GGUFs 🦥 Run OpenAI's 120b model on 66GB RAM & 20b model on 14GB RAM. Both in original precision. Uploads include our chat template fixes. Guide: https://docs.unsloth.ai/basics/gpt-oss GGUF: https://huggingface.co/unsloth/gpt-oss-20b-GGUF Our open models are here. Both of them. https://t.co/9tFxefOXcg" [X Link](https://x.com/UnslothAI/status/1952824564897210800) 2025-08-05T20:10Z 38.9K followers, 95.8K engagements

"Google releases Gemma [--] 270M, a new model that runs locally on just [---] GB RAM ✨ Trained on 6T tokens, it runs fast on phones & handles chat, coding & math. Run at [--] t/s with our Dynamic GGUF or fine-tune via Unsloth & export to your phone. Details: https://docs.unsloth.ai/basics/gemma-3-how-to-run-and-fine-tune Introducing Gemma [--] 270M: a tiny model, just [---] million parameters, very strong instruction following, fine-tune in just a few minutes, with a large vocabulary to serve as a high-quality foundation" [X Link](https://x.com/UnslothAI/status/1956027720288366883) 2025-08-14T16:18Z 33.8K followers, 156.6K engagements
"Google releases Gemma [--] 270M a new model that runs locally on just [---] GB RAM. ✨ Trained on 6T tokens it runs fast on phones & handles chat coding & math. Run at [--] t/s with our Dynamic GGUF or fine-tune via Unsloth & export to your phone. Details: https://docs.unsloth.ai/basics/gemma-3-how-to-run-and-fine-tune Introducing Gemma [--] 270M 🔥 A tiny model Just [---] million parameters Very strong instruction following Fine-tune in just a few minutes with a large vocabulary to serve as a high-quality foundation" [X Link](https://x.com/UnslothAI/status/1956027720288366883) 2025-08-14T16:18Z 33.8K followers, 156.6K engagements
"Can a 1-bit or 3-bit quantized model outperform GPT-4.1 or Claude-Opus-4 Yes Today we're excited to show how LLMs like DeepSeek-V3.1 can be quantized to just 1-bit or 3-bit and still beat SOTA models like Claude-Opus-4 (thinking) on Aider Polyglot. Details and blog below" [X Link](https://x.com/UnslothAI/status/1965797776387879378) 2025-09-10T15:21Z 38.1K followers, 165.4K engagements
"We made a free notebook that fine-tunes IBM Granite [---] into a powerful support agent This agent will enable real-time analysis & solving of customer interactions. You'll also learn how to train models using data from Google Sheets. Colab Notebook: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Granite4.0.ipynb" [X Link](https://x.com/UnslothAI/status/1973774439344214426) 2025-10-02T15:37Z 33.4K followers, 50.6K engagements
"OpenAI shows how gpt-oss can autonomously beat [----] using reinforcement learning (RL). Training was done locally with Unsloth on NVIDIA DGX Spark. You can also do it free on Colab. 🦥 OpenAI DevDay notebook: https://github.com/openai/gpt-oss/blob/main/examples/reinforcement-fine-tuning.ipynb" [X Link](https://x.com/UnslothAI/status/1976284209842118714) 2025-10-09T13:50Z 33.7K followers, 97.6K engagements
"You can now train models up to 200B parameters locally on NVIDIA DGX Spark with Unsloth 🦥 Fine-tune RL & deploy OpenAI gpt-oss-120b via our free notebook in 68GB unified memory: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/gpt-oss-(120B)_A100-Fine-tuning.ipynb Read our step-by-step guide in collab with NVIDIA https://docs.unsloth.ai/new/fine-tuning-llms-with-nvidia-dgx-spark-and-unsloth" [X Link](https://x.com/UnslothAI/status/1978456629613084926) 2025-10-15T13:43Z 33.8K followers, 46.4K engagements
"You can now fine-tune Qwen3-VL (8B) for free with our notebook Unsloth trains VLMs 1.7x faster with 60% less VRAM and 8x longer context - no accuracy loss. GitHub: https://github.com/unslothai/unsloth Qwen3-VL GRPO Colab: https://docs.unsloth.ai/models/qwen3-vl-run-and-fine-tune#fine-tuning-qwen3-vl Qwen3-VL Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen3_VL_(8B)-Vision.ipynb" [X Link](https://x.com/UnslothAI/status/1978821090135687182) 2025-10-16T13:51Z 34.9K followers, 107.9K engagements
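As a rough illustration of the vision fine-tuning workflow these notebook posts describe, here is a minimal sketch using Unsloth's `FastVisionModel`; the checkpoint name and LoRA hyperparameters are placeholders rather than the notebook's exact settings.

```python
# A minimal vision fine-tuning setup sketch, assuming the unsloth package.
# Checkpoint and hyperparameters are placeholders, not the notebook's recipe.
from unsloth import FastVisionModel

model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/Qwen2-VL-7B-Instruct",  # placeholder VLM checkpoint
    load_in_4bit=True,
)
model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers=True,    # also adapt the vision tower
    finetune_language_layers=True,  # and the language backbone
    r=16,
    lora_alpha=16,
)
# Training then proceeds as usual, e.g. with TRL's SFTTrainer over an
# image+text dataset formatted with the model's chat template.
```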
"We just hit [---] million lifetime downloads on Hugging Face 🦥🤗 Huge thanks to all of you The amazing community model creators and HF team." [X Link](https://x.com/UnslothAI/status/1980631523104813419) 2025-10-21T13:45Z 33.4K followers, 31.7K engagements
"You can now quantize LLMs to 4-bit and recover 70% accuracy via Quantization-Aware Training. We teamed up with @PyTorch to show how QAT enables: 4x less VRAM with no inference overhead 1-3% increase in raw accuracy (GPQA MMLU Pro) Notebook & Blog: https://docs.unsloth.ai/new/quantization-aware-training-qat" [X Link](https://x.com/UnslothAI/status/1981021761782317368) 2025-10-22T15:36Z 34K followers, 44.9K engagements
"We showcased our one-click fine-tuning UI for the first time at the NVIDIA x Mistral AI x Unsloth event at Y Combinator 🔥🦥 Huge thanks to everyone who came 🥰 Thank you to everyone who joined us at AI Dev Night with @UnslothAI and @MistralAI. We're looking forward to meeting more of you at #PyTorchCon #OpenSourceAIWeek. https://t.co/xCJrGMrbZ4" [X Link](https://x.com/UnslothAI/status/1981449186290913787) 2025-10-23T19:54Z 38.5K followers, 14.4K engagements
"We teamed up with @NVIDIA to teach you how to fine-tune LLMs on Blackwell & RTX [--] GPUs. Unsloth makes training on Blackwell up to [--] faster with 70% less VRAM - no accuracy loss. Learn how to use our new Docker image & more in the official NVIDIA Blog: https://developer.nvidia.com/blog/train-an-llm-on-an-nvidia-blackwell-desktop-with-unsloth-and-scale-it/" [X Link](https://x.com/UnslothAI/status/1982810257845035280) 2025-10-27T14:02Z 34K followers, 35.2K engagements
"You can now run Qwen3-VL locally Run the 235B variant for SOTA vision/OCR on 128GB unified memory (dynamic 4-bit). Includes our chat template fixes. Qwen3-VL-2B runs at [--] t/s on 4GB RAM. Fine-tune & RL via Unsloth free notebooks & export to GGUF. https://docs.unsloth.ai/models/qwen3-vl" [X Link](https://x.com/UnslothAI/status/1984251877295550933) 2025-10-31T13:31Z 39.7K followers, 92.8K engagements
"To run Qwen3-VL you can read our step-by-step tutorial and download the GGUFs from our Hugging Face collection: https://huggingface.co/collections/unsloth/qwen3-vl" [X Link](https://x.com/UnslothAI/status/1984280730692907073) 2025-10-31T15:26Z 34.1K followers, [----] engagements
"@Alibaba_Qwen Thank you for the support 🦥 Here's our free Colab notebooks for fine-tuning and reinforcement learning (RL) of Qwen3-VL-8B: https://x.com/UnslothAI/status/1978821090135687182 You can now fine-tune Qwen3-VL (8B) for free with our notebook Unsloth trains VLMs 1.7x faster with 60% less VRAM and 8x longer context - no accuracy loss. GitHub: https://t.co/aZWYAt9MMh Qwen3-VL GRPO Colab: https://t.co/HkjYydXDnR Qwen3-VL Colab: https://t.co/r3p2wgIzVS" [X Link](https://x.com/UnslothAI/status/1984822375951777981) 2025-11-02T03:18Z 34K followers, [----] engagements
"You can now fine-tune DeepSeek-OCR with our free notebook We fine-tuned DeepSeek-OCR improving its language understanding by 89% and reduced Character Error Rate from 149% to 60% Blog: https://docs.unsloth.ai/new/deepseek-ocr GitHub: https://github.com/unslothai/unsloth Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Deepseek_OCR_(3B)-Eval.ipynb" [X Link](https://x.com/UnslothAI/status/1985728926556307471) 2025-11-04T15:20Z 34.4K followers, 80.6K engagements
"@donvito Most models with up to 32B parameters (e.g. Qwen3-32B) can fine-tune locally with Unsloth on a 24GB VRAM GPU. 🥰 LoRA or FFT will use much more VRAM though. You can find more details about this in our docs" [X Link](https://x.com/UnslothAI/status/1985749459666682123) 2025-11-04T16:42Z 34.1K followers, [----] engagements
"You can now run Kimi K2 Thinking locally with our Dynamic 1-bit GGUFs We shrank the 1T model to 245GB (-62%) & retained 85% of accuracy. Run on 247GB RAM. We also worked with the Kimi team on a system prompt fix. Guide: https://docs.unsloth.ai/models/kimi-k2-thinking-how-to-run-locally GGUF: https://huggingface.co/unsloth/Kimi-K2-Thinking-GGUF" [X Link](https://x.com/UnslothAI/status/1987184173782888683) 2025-11-08T15:43Z 36.7K followers, 178.2K engagements
"You can also run Kimi K2 Thinking in full precision by using our 4-bit or 5-bit GGUFs since the original model was released as INT4. This will require 520GB - 730GB RAM/VRAM for fast inference" [X Link](https://x.com/UnslothAI/status/1987206851554128337) 2025-11-08T17:13Z 36.7K followers, [----] engagements
"You can now run Unsloth GGUFs locally via Docker Run LLMs on Mac or Windows with one line of code or no code at all We collabed with Docker to make Dynamic GGUFs available for everyone Just run: docker model run ai/gpt-oss:20B Guide: https://docs.unsloth.ai/models/how-to-run-llms-with-docker" [X Link](https://x.com/UnslothAI/status/1990428016296812595) 2025-11-17T14:33Z 35.3K followers, 93.2K engagements
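The `docker model run ai/gpt-oss:20B` one-liner quoted above can also be driven from a script. A minimal sketch, assuming Docker Model Runner is installed; the prompt and error handling are illustrative.

```python
# A minimal sketch of invoking Docker Model Runner from Python.
# "docker model run ai/gpt-oss:20B" is the command quoted in the post above;
# the prompt and error check here are illustrative additions.
import subprocess

result = subprocess.run(
    ["docker", "model", "run", "ai/gpt-oss:20B",
     "Say hello in one sentence."],
    capture_output=True,
    text=True,
    check=True,  # raise if Docker or the model image is unavailable
)
print(result.stdout)
```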
"We made a guide on how to deploy LLMs locally with SGLang In collab with @lmsysorg you'll learn to: Deploy fine-tuned LLMs for large scale production Serve GGUFs locally Benchmark inference speed Use on-the-fly FP8 for 1.6x inference Guide: https://docs.unsloth.ai/basics/inference-and-deployment/sglang-guide" [X Link](https://x.com/UnslothAI/status/1991879337923211675) 2025-11-21T14:40Z 35.2K followers, 29K engagements
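SGLang serves an OpenAI-compatible endpoint, so a model deployed per the guide above can be queried with the standard `openai` client. A minimal sketch; the launch command in the comment and the model name are illustrative assumptions, not the guide's exact configuration.

```python
# A minimal sketch of querying a locally served model through SGLang's
# OpenAI-compatible API. Launch command and model name are placeholders:
#
#   python -m sglang.launch_server --model-path Qwen/Qwen3-4B --port 30000
#
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="Qwen/Qwen3-4B",  # must match the served model path
    messages=[{"role": "user", "content": "Summarize what SGLang does."}],
    max_tokens=128,
)
print(resp.choices[0].message.content)
```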
"You can now run FP8 reinforcement learning on consumer GPUs Try DeepSeek-R1's FP8 GRPO at home using only a 5GB GPU. Qwen3-1.7B fits in 5GB VRAM. We collabed with PyTorch to make FP8 RL inference [---] faster. Unsloth: 60% less VRAM [--] longer context. https://docs.unsloth.ai/new/fp8-reinforcement-learning" [X Link](https://x.com/UnslothAI/status/1993358367776186801) 2025-11-25T16:37Z 38K followers, 144.6K engagements
"You can now do 500K context length fine-tuning with Unsloth Train gpt-oss-20b to extend its context window to 530K on 80GB VRAM & 750K+ on 192GB - no accuracy loss. Unsloth's new algorithms + Tiled MLP = 72% less VRAM & 6x more context Blog + Notebook: https://docs.unsloth.ai/new/500k-context-length-fine-tuning" [X Link](https://x.com/UnslothAI/status/1995504614440157409) 2025-12-01T14:45Z 37.7K followers, 41K engagements
"To clarify yes this release supports any LLM or VLM not just gpt-oss - with limited RL support as well. :) More details in our blogpost" [X Link](https://x.com/UnslothAI/status/1995515160950378733) 2025-12-01T15:27Z 35.2K followers, [----] engagements
"Mistral releases Ministral [--] their new reasoning and instruct models 🔥 Ministral [--] comes in 3B 8B and 14B with vision support and best-in-class performance. Run the 14B models locally with 24GB RAM. Guide + Notebook: https://docs.unsloth.ai/new/ministral-3 GGUFs: https://huggingface.co/collections/unsloth/ministral-3 Introducing the Mistral [--] family of models: Frontier intelligence at all sizes. Apache [---]. Details in 🧵 https://t.co/lsrDmhW78u" [X Link](https://x.com/UnslothAI/status/1995874975631503479) 2025-12-02T15:17Z 40.3K followers, 81.6K engagements
"@Alibaba_Qwen Let's gooo Qwen & open-source 🦥" [X Link](https://x.com/UnslothAI/status/1996498782796792043) 2025-12-04T08:36Z 35.3K followers, [---] engagements
"You can now train Mistral Ministral [--] with reinforcement learning in our free notebook You'll GRPO the model to solve sudoku autonomously. Learn about our new reward functions RL environment & reward hacking. Blog: https://docs.unsloth.ai/new/ministral-3 Notebook: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Ministral_3_(3B)_Reinforcement_Learning_Sudoku_Game.ipynb" [X Link](https://x.com/UnslothAI/status/1996595704438120774) 2025-12-04T15:01Z 37.7K followers, 41K engagements
"NVIDIA releases Nemotron [--] Nano a new 30B hybrid reasoning model 🔥 Nemotron [--] has a 1M context window and best-in-class performance for SWE-Bench reasoning and chat. Run the MoE model locally with 24GB RAM. Guide: https://docs.unsloth.ai/models/nemotron-3 GGUF: https://huggingface.co/unsloth/Nemotron-3-Nano-30B-A3B-GGUF" [X Link](https://x.com/UnslothAI/status/2000568378407452746) 2025-12-15T14:07Z 38.5K followers, 138.3K engagements
"We teamed up with @NVIDIA and @MatthewBerman to teach you how to do Reinforcement Learning Learn about: - RL environments reward functions & reward hacking - Training OpenAI gpt-oss to automatically solve [----] - Local Windows training with @NVIDIA_AI_PC RTX GPUs - How RLVR (verifiable rewards) works - How to interpret RL metrics like KL Divergence Full video tutorial: https://www.youtube.com/watch?v=9t-BAjzBWj8" [X Link](https://x.com/UnslothAI/status/2000936703830134977) 2025-12-16T14:31Z 38.5K followers, 51.9K engagements
"Google releases FunctionGemma a new 270M parameter model that runs on just [---] GB RAM. ✨ Built for tool-calling run locally on your phone at 50+ tokens/s or fine-tune with Unsloth & deploy to your phone. Docs + Notebook: https://docs.unsloth.ai/models/functiongemma GGUF: https://huggingface.co/unsloth/functiongemma-270m-it-GGUF Introducing FunctionGemma 270m model for function calling 📱 can run in your phone browser or other devices designed to be specialized for your own tasks" [X Link](https://x.com/UnslothAI/status/2001704687880606104) 2025-12-18T17:22Z 38.2K followers, 219.4K engagements
"@Alibaba_Qwen Congrats guys this is an amazing open-source effort 🥰 We made Qwen-Image-Edit-2511 GGUFs so everyone can run it locally https://huggingface.co/unsloth/Qwen-Image-Edit-2511-GGUF" [X Link](https://x.com/UnslothAI/status/2003500266340123127) 2025-12-23T16:17Z 38.5K followers, 32.5K engagements
"@NVIDIAAIDev Thanks guys for the constant support 🦥" [X Link](https://x.com/UnslothAI/status/2003609001561641046) 2025-12-23T23:29Z 38.1K followers, [---] engagements
"Merry Christmas from Unsloth Thank you for all the support this year We're excited to keep shipping open-source next year 🥰" [X Link](https://x.com/UnslothAI/status/2003841415441490265) 2025-12-24T14:53Z 38.5K followers, [----] engagements
"We just crossed [-----] stars on GitHub 🦥 Huge thanks to you every contributor and our amazing community for all your support. Our GitHub repo: https://github.com/unslothai/unsloth" [X Link](https://x.com/UnslothAI/status/2006010458520568225) 2025-12-30T14:32Z 38.6K followers, 19.9K engagements
"@Alibaba_Qwen [----] was so amazing because of Qwen We're super excited for Qwen4 in [----] 🥰" [X Link](https://x.com/UnslothAI/status/2006225629356957812) 2025-12-31T04:47Z 38.6K followers, [----] engagements
"@Alibaba_Qwen Thanks guys for the support and day zero access We're excited for more Qwen in [----]" [X Link](https://x.com/UnslothAI/status/2006307602498818178) 2025-12-31T10:13Z 38.6K followers, [----] engagements
"We made a guide on how to run Qwen-Image diffusion models locally Learn to: Run Qwen-Image-2512 and Edit-2511 Use GGUF FP8 in ComfyUI stable-diffusion.cpp diffusers Create workflows & prompts Adjust hyperparams (sampling guidance) Guide: https://unsloth.ai/docs/models/qwen-image-2512" [X Link](https://x.com/UnslothAI/status/2009273362325913980) 2026-01-08T14:37Z 39.7K followers, 30.8K engagements
"@Zai_org Thank you guys for this amazing release You can now run & fine-tune the model locally: https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF" [X Link](https://x.com/UnslothAI/status/2013480083739288035) 2026-01-20T05:13Z 39.8K followers, [----] engagements
"Update: For improved performance please use: --dry-multiplier [---] --temp [---] --top-k [--] --top-p [----] --min-p [----] which should reduce any looping or incorrect output issues. --dry-multiplier [---] especially works well. For more information see: https://unsloth.ai/docs/models/glm-4.7-flash#reducing-repetition-and-looping" [X Link](https://x.com/UnslothAI/status/2013513461091999838) 2026-01-20T07:26Z 39.3K followers, [----] engagements
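For readers who want to try the sampler flags named in the update above with llama.cpp's `llama-cli`, the sketch below wires them up. Note that the post's recommended values are redacted in this snapshot, so every numeric value here is a placeholder of our own; take the real ones from the linked docs.

```python
# A sketch of passing the sampler flags from the post above to llama-cli.
# ALL numeric values and the model path are placeholders (the recommended
# values are redacted in this document); substitute the documented ones.
import subprocess

subprocess.run([
    "llama-cli",
    "-m", "GLM-4.7-Flash-Q4_K_M.gguf",  # placeholder local GGUF path
    "--dry-multiplier", "0.5",           # placeholder value
    "--temp", "1.0",                     # placeholder value
    "--top-k", "40",                     # placeholder value
    "--top-p", "0.95",                   # placeholder value
    "--min-p", "0.05",                   # placeholder value
    "-p", "Write a haiku about sloths.",
], check=True)
```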
"@Zai_org Congrats guys on the release & thank you for supporting open-source 🥰 We uploaded GLM-5 GGUFs so people can run it locally: https://huggingface.co/unsloth/GLM-5-GGUF" [X Link](https://x.com/UnslothAI/status/2021665541203861995) 2026-02-11T19:20Z 43.5K followers, 16.6K engagements
"You can now run GLM-5 locally 🔥 GLM-5 is a new open SOTA agentic coding & chat LLM with 200K context. We shrank the 744B model from 1.65TB to 241GB (-85%) via Dynamic 2-bit. Runs on a 256GB Mac or RAM/VRAM setups. Guide: https://unsloth.ai/docs/models/glm-5 GGUF: https://huggingface.co/unsloth/GLM-5-GGUF Introducing GLM-5: From Vibe Coding to Agentic Engineering GLM-5 is built for complex systems engineering and long-horizon agentic tasks. Compared to GLM-4.5 it scales from 355B params (32B active) to 744B (40B active) with pre-training data growing from 23T to 28.5T tokens." [X Link](https://x.com/anyuser/status/2021931246247690666) 2026-02-12T12:55Z 43.5K followers, 225.2K engagements
"Run DeepSeek-V3.1 locally on 170GB RAM with Dynamic 1-bit GGUFs The 715GB model gets reduced to 170GB (-80% size) by smartly quantizing layers. The 1-bit GGUF passes all our code tests & we fixed the chat template Guide: https://docs.unsloth.ai/basics/deepseek-v3.1 GGUF: https://huggingface.co/unsloth/DeepSeek-V3.1-GGUF" [X Link](https://x.com/UnslothAI/status/1958980042048065555) 2025-08-22T19:50Z 43.3K followers, 60.3K engagements
"OpenAI gpt-oss with ultra long context is here Introducing Unsloth Flex Attention which enables 61K context for gpt-oss bf16 training on a 80GB GPU. Unsloth achieves 8x longer context 50% less VRAM & 1.5x faster training vs. all implementations. https://docs.unsloth.ai/basics/long-context-gpt-oss-training" [X Link](https://x.com/UnslothAI/status/1961108732361994248) 2025-08-28T16:48Z 43.3K followers, 142.4K engagements
"Unsloth now has a Docker image 🐳 Train LLMs locally with no setup: just run the image and go. Includes every pre-made Unsloth notebook. Solves dependency or environment issues. Guide: https://docs.unsloth.ai/new/how-to-train-llms-with-unsloth-and-docker" [X Link](https://x.com/UnslothAI/status/1973383044225536312) 2025-10-01T13:42Z 43.2K followers, 97.1K engagements
"You can now fine-tune LLMs and deploy them directly on your phone We collabed with PyTorch so you can export and run your trained model 100% locally on your iOS or Android device. Deploy Qwen3 on Pixel [--] and iPhone [--] Pro at [--] tokens/sec. Guide: https://docs.unsloth.ai/new/deploy-llms-phone" [X Link](https://x.com/UnslothAI/status/2001305185206091917) 2025-12-17T14:55Z 41.6K followers, 136.6K engagements
"NVIDIA made a beginner's guide to fine-tuning LLMs with Unsloth You'll learn about: - Training methods: LoRA FFT RL - When to fine-tune and why + use-cases - Amount of data and VRAM needed - How to train locally on DGX Spark RTX GPUs & more Guide: https://blogs.nvidia.com/blog/rtx-ai-garage-fine-tuning-unsloth-dgx-spark/" [X Link](https://x.com/UnslothAI/status/2003098731852488864) 2025-12-22T13:42Z 42.3K followers, 140K engagements
"You can now fine-tune LLMs with Unsloth then deploy them in @LMStudio 🦥💾 We made a free notebook to fine-tune FunctionGemma (270M) so it thinks before calling tools then export the model to GGUF for deployment in LM Studio. Notebook: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/FunctionGemma_(270M)-LMStudio.ipynb We worked with @UnslothAI on a new beginner's guide: How to fine-tune FunctionGemma and run it locally 🧠 Train FunctionGemma for custom tool calls ✨ Convert it to GGUF + import into LM Studio 💾 Serve it locally and use it in your code Step-by-step" [X Link](https://x.com/UnslothAI/status/2003493564878393370) 2025-12-23T15:51Z 41.7K followers, 60.7K engagements
"Qwen releases Qwen-Image-2512 a new SOTA text-to-image model. It's the top performing open diffusion model on AI Arena and has more realistic + accurate images/text. Run locally with 14GB RAM via our Dynamic GGUF Guide: https://unsloth.ai/docs/models/qwen-image-2512 GGUF: https://huggingface.co/unsloth/Qwen-Image-2512-GGUF A New Year gift from Qwen: Qwen-Image-2512 is here. Our December upgrade to Qwen-Image just in time for the New Year. ✨ What's new: More realistic humans dramatically reduced AI look richer facial details Finer natural textures sharper landscapes water https://t.co/8X6AVcJCIG" [X Link](https://x.com/UnslothAI/status/2006297912557633586) 2025-12-31T09:34Z 43.2K followers, 120.2K engagements
"You can now do reinforcement learning training with [--] longer context and no accuracy loss via our new batching algorithms. Long reasoning chains in RL are costly but now we enable you to train gpt-oss with GRPO & reach 380K context on a 192GB GPU. https://unsloth.ai/docs/new/grpo-long-context" [X Link](https://x.com/UnslothAI/status/2011827592886960131) 2026-01-15T15:47Z 41.6K followers, 71.9K engagements
"You can now fine-tune embedding models in our free notebook Improve retrieval and RAG with better semantic search & similarity. Unsloth trains 2x faster 20% less VRAM 2x context & no accuracy loss Blog: https://unsloth.ai/docs/new/embedding-finetuning EmbeddingGemma (300M): https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/EmbeddingGemma_(300M).ipynb" [X Link](https://x.com/anyuser/status/2014369691117170880) 2026-01-22T16:08Z 43.5K followers, 81.5K engagements
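The embedding fine-tuning collab above builds on sentence-transformers, so a minimal sketch of that library's training loop may help; the checkpoint and the two training pairs are placeholders of our own, not the notebook's data.

```python
# A minimal embedding fine-tuning sketch with sentence-transformers.
# Checkpoint and training pairs are placeholders, not the notebook's data.
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    losses,
)
from datasets import Dataset

model = SentenceTransformer("google/embeddinggemma-300m")  # placeholder
train = Dataset.from_dict({
    "anchor": [
        "How do I reset my password?",
        "Where can I download my invoice?",
    ],
    "positive": [
        "Steps for resetting a forgotten account password.",
        "Invoices are available under Billing > Documents.",
    ],
})
trainer = SentenceTransformerTrainer(
    model=model,
    train_dataset=train,
    # Contrastive loss that treats other in-batch examples as negatives.
    loss=losses.MultipleNegativesRankingLoss(model),
)
trainer.train()
```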
"Unsloth is excited to support @HuggingFace Transformers v5 🤗🦥 Get all the latest performance improvements in inference training and more Transformers v5's FINAL stable release is out 🔥 Transformers' biggest release. The big Ws of this release: - Performance especially for MoE (6x-11x speedups) - No more slow/fast tokenizers - way simpler API explicit backends better performance - dynamic weight loading https://t.co/PV9lmE3KJx" [X Link](https://x.com/anyuser/status/2015935368525447395) 2026-01-26T23:50Z 43.5K followers, 21.6K engagements
"DeepSeek releases DeepSeek-OCR [--]. The new 3B model achieves SOTA visual document and OCR understanding. DeepEncoder V2 is introduced which enables the model to scan images in the same logical order as humans boosting OCR accuracy. Instead of traditional vision LLMs which read an image in a fixed grid (top-left to bottom-right) DeepEncoder V2 first builds a global understanding then learns a human-like reading order - what to attend to first next and so on. This improves OCR on complex layouts helping it follow columns link labels to values read tables coherently and handle mixed text + structure" [X Link](https://x.com/anyuser/status/2016030864304701561) 2026-01-27T06:09Z 43.5K followers, 222.8K engagements
"@Kimi_Moonshot Congrats guys & thank you for this amazing open release We're working on Dynamic GGUFs so you guys can run it locally: https://huggingface.co/unsloth/Kimi-K2.5-GGUF" [X Link](https://x.com/UnslothAI/status/2016048653711114608) 2026-01-27T07:20Z 43.3K followers, 101.9K engagements
"For tutorials on how to Run & Fine-tune DeepSeek-OCR [--] you can read our guide: https://unsloth.ai/docs/models/deepseek-ocr-2 Inference & training for the model is already supported in Unsloth." [X Link](https://x.com/anyuser/status/2016076972976214494) 2026-01-27T09:13Z 43.5K followers, [----] engagements
"Note that VRAM is not required. You can run on a Mac with 256GB unified memory with similar speeds or [---] RAM without VRAM. You can even run with much less compute (e.g. 80GB RAM) as it'll offload but it'll be slower." [X Link](https://x.com/anyuser/status/2016532064955191619) 2026-01-28T15:21Z 43.5K followers, 16.5K engagements
"@Alibaba_Qwen Thank you so much for releasing an open-source LLM for fast and smart coding 🥰 We made GGUFs so you can run Qwen3-Coder-Next locally on 46GB RAM or less: https://huggingface.co/unsloth/Qwen3-Coder-Next-GGUF" [X Link](https://x.com/UnslothAI/status/2018721602754789563) 2026-02-03T16:21Z 43.4K followers, 23.4K engagements
"@Alibaba_Qwen We're super excited for more Qwen models this year 🥰 Let's go open-source" [X Link](https://x.com/UnslothAI/status/2018735454565396680) 2026-02-03T17:16Z 43.3K followers, [----] engagements
"@NVIDIAAIDev @huggingface Congrats guys thank you Nvidia team for releasing brilliant open-source models" [X Link](https://x.com/UnslothAI/status/2018923476636111280) 2026-02-04T05:44Z 43.3K followers, [----] engagements
"We made a guide on how to do tool calling with local LLMs. Learn how to use open models like Qwen3-Coder-Next and GLM-4.7-Flash for function calling. Has hands-on examples for: story writing Python execution terminal tool calls maths and more. Guide: https://unsloth.ai/docs/basics/tool-calling-guide-for-local-llms" [X Link](https://x.com/UnslothAI/status/2019440238272344418) 2026-02-05T15:57Z 42.2K followers, [---] engagements
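As an illustration of the function-calling pattern the guide above teaches, here is a minimal sketch against a local OpenAI-compatible server (for example llama.cpp's `llama-server`); the endpoint, model name, and `get_weather` tool are hypothetical, not taken from the guide.

```python
# A minimal tool-calling sketch against a local OpenAI-compatible server.
# Endpoint, model name, and the weather tool are illustrative placeholders.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
resp = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
# The model may answer directly instead of calling the tool; guard for that.
calls = resp.choices[0].message.tool_calls or []
for call in calls:
    print(call.function.name, json.loads(call.function.arguments))
```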
"@Zai_org Congrats guys GLM-4.7-Flash is actually one of the most popular models we've ever seen 🔥 https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF" [X Link](https://x.com/UnslothAI/status/2021209874378461406) 2026-02-10T13:09Z 43.3K followers, [----] engagements
"GLM-4.7-Flash GGUFs now produce significantly better outputs after recent llama.cpp bug fixes. We reconverted and updated the GGUFs. Run 4-bit locally on 18GB RAM. To get the fixes re-download & use the inference parameters recommended by @Zai_org. Updated GGUFs: https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF You can now run GLM-4.7-Flash locally on your device 🔥 GLM-4.7-Flash is the best performing 30B model on SWE-Bench and GPQA. With 200K context it excels at coding agents chat & reasoning. Run local with 24GB RAM. Guide: https://t.co/SpJxl00VIa GGUF: https://t.co/aTuUxu32z3 https://t.co/3MwNRe3iva" [X Link](https://x.com/UnslothAI/status/2013966866646180345) 2026-01-21T13:28Z 43.5K followers, 153K engagements
"You can now train LLMs [--] faster with no accuracy loss via our new RoPE and MLP kernels. Our Triton kernels plus smart auto packing deliver [--] faster training & 30% less VRAM vs optimized FA3 setups. Train Qwen3-4B 3x faster on just 3.9GB VRAM. Blog: https://docs.unsloth.ai/new/3x-faster-training-packing" [X Link](https://x.com/anyuser/status/1998765021170696664) 2025-12-10T14:41Z 43.5K followers, 627.7K engagements
"You can now run GLM-4.7-Flash locally on your device 🔥 GLM-4.7-Flash is the best performing 30B model on SWE-Bench and GPQA. With 200K context it excels at coding agents chat & reasoning. Run local with 24GB RAM. Guide: https://unsloth.ai/docs/models/glm-4.7-flash GGUF: https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF Introducing GLM-4.7-Flash: Your local coding and agentic assistant. Setting a new standard for the 30B class GLM-4.7-Flash balances high performance with efficiency making it the perfect lightweight deployment option. Beyond coding it is also recommended for creative writing" [X Link](https://x.com/UnslothAI/status/2013482180564132092) 2026-01-20T05:22Z 43.5K followers, 335.5K engagements
"You can now run Kimi K2.5 locally 🔥 We shrank the 1T model to 240GB (-60%) via Dynamic 1-bit. Run at [--] tok/s on 240GB VRAM/RAM. 2-bit is recommended as it passes our code tests. Run near full precision on 622GB. Guide: https://unsloth.ai/docs/models/kimi-k2.5 GGUF: https://huggingface.co/unsloth/Kimi-K2.5-GGUF Meet Kimi K2.5 Open-Source Visual Agentic Intelligence. 🔹 Global SOTA on Agentic Benchmarks: HLE full set (50.2%) BrowseComp (74.9%) 🔹 Open-source SOTA on Vision and Coding: MMMU Pro (78.5%) VideoMMMU (86.6%) SWE-bench Verified (76.8%) 🔹 Code with Taste: turn chats" [X Link](https://x.com/anyuser/status/2016511345311834293) 2026-01-28T13:59Z 43.5K followers, 464.4K engagements
"We successfully trained an LLM without human intervention using Claude Code. We made a guide on how to do this with local LLMs via Claude Code and OpenAI Codex. Connect GLM-4.7-Flash to your server and start agentic coding locally Guide: https://unsloth.ai/docs/basics/claude-codex" [X Link](https://x.com/anyuser/status/2016901669792210970) 2026-01-29T15:50Z 43.5K followers, 138.1K engagements
"Qwen releases Qwen3-Coder-Next. The new 80B MoE model excels at agentic coding & local use. With 256K context it delivers similar performance to models with 10-20x more active parameters. Run on 46GB RAM or less. Guide: https://unsloth.ai/docs/models/qwen3-coder-next GGUF: https://huggingface.co/unsloth/Qwen3-Coder-Next-GGUF Introducing Qwen3-Coder-Next an open-weight LM built for coding agents & local development. What's new: Scaling agentic training: 800K verifiable tasks + executable envs Efficiency-Performance Tradeoff: achieves strong results on SWE-Bench Pro with 80B total params" [X Link](https://x.com/anyuser/status/2018718997584474191) 2026-02-03T16:11Z 43.5K followers, 239.1K engagements
"We created a tool-calling guide for local LLMs Learn how to use any open model like Qwen3-Coder-Next and GLM-4.7-Flash for function calling. We provide hands-on examples for: story writing Python execution terminal tool calls maths and more. Guide: https://unsloth.ai/docs/basics/tool-calling-guide-for-local-llms" [X Link](https://x.com/anyuser/status/2019442022155976895) 2026-02-05T16:04Z 43.5K followers, 46.7K engagements
"You can now train MoE models [--] faster with 35% less VRAM via our new Triton kernels (no accuracy loss). Train gpt-oss locally on 12.8GB VRAM. In collab with @HuggingFace Unsloth trains DeepSeek Qwen3 GLM faster. Repo: https://github.com/unslothai/unsloth Blog: https://unsloth.ai/docs/new/faster-moe" [X Link](https://x.com/anyuser/status/2021244131927023950) 2026-02-10T15:25Z 43.5K followers, 208.9K engagements
"You can now run MiniMax-2.5 locally At 230B parameters MiniMax-2.5 is the strongest LLM under 700B params delivering SOTA agentic coding & chat. Run Dynamic 3/4-bit on a 128GB Mac for [--] tokens/s. Guide: https://unsloth.ai/docs/models/minimax-2.5 GGUF: https://huggingface.co/unsloth/MiniMax-M2.5-GGUF Introducing M2.5 an open-source frontier model designed for real-world productivity. - SOTA performance at coding (SWE-Bench Verified 80.2%) search (BrowseComp 76.3%) agentic tool-calling (BFCL 76.8%) & office work. - Optimized for efficient execution 37% faster at complex" [X Link](https://x.com/anyuser/status/2023029952791322627) 2026-02-15T13:41Z 43.5K followers, 125.3K engagements
"Introducing M2.5 an open-source frontier model designed for real-world productivity. - SOTA performance at coding (SWE-Bench Verified 80.2%) search (BrowseComp 76.3%) agentic tool-calling (BFCL 76.8%) & office work. - Optimized for efficient execution 37% faster at complex tasks. - At $1 per hour with [---] tps infinite scaling of long-horizon agents now economically possible MiniMax Agent: http://agent.minimax.io API: http://platform.minimax.io Coding Plan: http://platform.minimax.io/subscribe/coding-plan" [X Link](https://x.com/anyuser/status/2021980761210134808) 2026-02-12T16:12Z 61.5K followers, 5.1M engagements
"Introducing GLM-5: From Vibe Coding to Agentic Engineering GLM-5 is built for complex systems engineering and long-horizon agentic tasks. Compared to GLM-4.5 it scales from 355B params (32B active) to 744B (40B active) with pre-training data growing from 23T to 28.5T tokens. Try it now: http://chat.z.ai Weights: http://huggingface.co/zai-org/GLM-5 Tech Blog: http://z.ai/blog/glm-5 OpenRouter (Previously Pony Alpha): http://openrouter.ai/z-ai/glm-5 Rolling out from Coding Plan Max users: http://z.ai/subscribe" [X Link](https://x.com/anyuser/status/2021638634739527773) 2026-02-11T17:33Z 50.9K followers, 1.4M engagements
"tuning open weight models on colab is hands down the biggest educational unlock out there. It sets students hackers devs anyone off on a rabbit hole journey of tinkering with their own models. for me what @UnslothAI have done with moe training is both a technical and educational wonder that we just need to soak in. you can take @OpenAIDevs gpt-oss-20b and fine tune it for free in a few hours." [X Link](https://x.com/anyuser/status/2021610578138054773) 2026-02-11T15:41Z [----] followers, [----] engagements
"RT @NVIDIAAIDev: This is an incredible performance breakthrough from @UnslothAI. 12x faster fine-tuning 35% less VRAM all with no loss i" [X Link](https://x.com/anyuser/status/2021400063914934524) 2026-02-11T01:45Z 43.5K followers, [---] engagements
"This is an incredible performance breakthrough from @UnslothAI. 12x faster fine-tuning 35% less VRAM all with no loss in accuracy enables fine-tuning of MoE models like gpt-oss-20b on just [--] GB of VRAM. You can now train MoE models [--] faster with 35% less VRAM via our new Triton kernels (no accuracy loss). Train gpt-oss locally on 12.8GB VRAM. In collab with @HuggingFace Unsloth trains DeepSeek Qwen3 GLM faster. Repo: https://t.co/aZWYAtakBP Blog: https://t.co/3wRiBxVJB6 https://t.co/MZke9gtISU" [X Link](https://x.com/anyuser/status/2021398069523382737) 2026-02-11T01:37Z 91.8K followers, 127.7K engagements
"GLM-4.7-Flash-GGUF is now the most downloaded model on @UnslothAI" [X Link](https://x.com/anyuser/status/2021207517557051627) 2026-02-10T12:59Z 50.9K followers, 56.4K engagements
"RT @Alibaba_Qwen: Thanks for the support from day 0" [X Link](https://x.com/anyuser/status/2018731275167936857) 2026-02-03T17:00Z 43.5K followers, [--] engagements
"Thanks for the support from day [--] Qwen releases Qwen3-Coder-Next. The new 80B MoE model excels at agentic coding & local use. With 256K context it delivers similar performance to models with 10-20x more active parameters. Run on 46GB RAM or less. Guide: https://t.co/kFrY9qi5co GGUF: https://t.co/J6Eb8c1nKO https://t.co/nBeplo3cdG" [X Link](https://x.com/anyuser/status/2018730714636996889) 2026-02-03T16:58Z 142.4K followers, 30.8K engagements
"Introducing Qwen3-Coder-Next an open-weight LM built for coding agents & local development. What's new: Scaling agentic training: 800K verifiable tasks + executable envs Efficiency-Performance Tradeoff: achieves strong results on SWE-Bench Pro with 80B total params and 3B active ✨ Supports OpenClaw Qwen Code Claude Code web dev browser use Cline etc Hugging Face: ModelScope: https://modelscope.cn/collections/Qwen/Qwen3-Coder-Next Blog: https://qwen.ai/blog?id=qwen3-coder-next Tech Report: https://github.com/QwenLM/Qwen3-Coder/blob/main/qwen3_coder_next_tech_report.pdf" [X Link](https://x.com/anyuser/status/2018718453570707465) 2026-02-03T16:09Z 142.4K followers, 1.5M engagements
"Meet Kimi K2.5 Open-Source Visual Agentic Intelligence. 🔹 Global SOTA on Agentic Benchmarks: HLE full set (50.2%) BrowseComp (74.9%) 🔹 Open-source SOTA on Vision and Coding: MMMU Pro (78.5%) VideoMMMU (86.6%) SWE-bench Verified (76.8%) 🔹 Code with Taste: turn chats images & videos into aesthetic websites with expressive motion. 🔹 Agent Swarm (Beta): self-directed agents working in parallel at scale. Up to [---] sub-agents [----] tool calls [---] faster compared with single-agent setup. K2.5 is now live in chat mode and agent mode. K2.5 Agent Swarm in beta for high-tier users." [X Link](https://x.com/anyuser/status/2016024049869324599) 2026-01-27T05:42Z 112.1K followers, 7.2M engagements
"Transformers v5's FINAL stable release is out 🔥 Transformers' biggest release. The big Ws of this release: - Performance especially for MoE (6x-11x speedups) - No more slow/fast tokenizers - way simpler API explicit backends better performance - dynamic weight loading: way faster and enabling: MoE now working w/ quants tp peft. We have a migration guide on the main branch; please take a look at it in case you run into issues. Come in our GH issues if you still do after reading it" [X Link](https://x.com/anyuser/status/2015802366730395764) 2026-01-26T15:01Z 10.7K followers, 74.4K engagements
"Sentence Transformers 🤝 @UnslothAI We've collaborated with the fine folks at @UnslothAI to make your embedding model finetuning 2x faster and require 20% less VRAM The Unsloth team prepared [--] notebooks showing how you can take advantage of it 🧵" [X Link](https://x.com/anyuser/status/2014391185616367618) 2026-01-22T17:34Z [----] followers, 15K engagements
"Unsloth now supports fine-tuning of LLMs with 4x longer context windows We managed to reduce memory usage by a further 30% at the cost of +1.9% extra time overhead. Read our blog: http://unsloth.ai/blog/long-context http://unsloth.ai/blog/long-context"
X Link 2024-04-09T16:30Z [----] followers, [----] engagements
"This works on all model architectures which use gradient checkpointing (ie stable diffusion Mamba etc) See bar graph for memory saving benchmarks:"
X Link 2024-04-09T16:35Z [----] followers, [---] engagements
"Long-context Llama [--] finetuning is here π¦ Unsloth supports 48K context lengths for Llama-3 70b on a 80GB GPU - 6x longer than HF+FA2 QLoRA finetuning Llama-3 70b is 1.8x faster uses 68% less VRAM & Llama-3 8b is 2x faster and fits in a 8GB GPU Blog: https://www.unsloth.ai/blog/llama3 https://www.unsloth.ai/blog/llama3"
X Link 2024-04-24T18:24Z [----] followers, 59.4K engagements
"Mistral's new model NeMo (12B) is now supported Unsloth makes finetuning NeMo fit in a 12GB GPU QLoRA training is 2x faster uses 60% less memory & we support 3-4x longer context lengths than HF+FA2. Read our Blog: https://unsloth.ai/blog/mistral-nemo https://unsloth.ai/blog/mistral-nemo"
X Link 2024-07-19T15:48Z [----] followers, 11.6K engagements
"@rohanpaul_ai @MistralAI @nvidia @danielhanchen Thank you so much Rohan as always for supporting Unsloth π¦₯ Hope you will like Unsloth Studio (our upcoming UI) which will hopefully be out next week. π₯°"
X Link 2024-07-21T07:40Z [----] followers, [---] engagements
"@danielhanchen We have uploaded 4bit bnb quants for now and are working on Llama [---] support Llama [---] (8B) 4bit: Llama [---] (8B) Instruct 4bit: Llama [---] (70B) 4bit: Llama [---] (70B) Instruct 4bit: https://huggingface.co/unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit https://huggingface.co/unsloth/Meta-Llama-3.1-70B-bnb-4bit https://huggingface.co/unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit https://huggingface.co/unsloth/Meta-Llama-3.1-8B-bnb-4bit https://huggingface.co/unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit https://huggingface.co/unsloth/Meta-Llama-3.1-70B-bnb-4bit"
X Link 2024-07-23T17:25Z [----] followers, [----] engagements
"Llama [---] support is here Unsloth supports 48K context lengths for Llama [---] (70B) on a 80GB GPU - 6x longer than HF+FA2. QLoRA fine-tuning Llama [---] (70B) is 1.9x faster uses 65% less VRAM & Llama [---] (8B) is 2.1x faster and fits in a 8GB GPU Blog: https://unsloth.ai/blog/llama3-1 https://unsloth.ai/blog/llama3-1 https://unsloth.ai/blog/llama3-1 https://unsloth.ai/blog/llama3-1"
X Link 2024-07-23T20:14Z [----] followers, 11.1K engagements
"We just hit [--] million monthly downloads on @HuggingFace π¦₯π₯³ Over 13K models trained with Unsloth have also been uploaded to Hugging Face. Huge thanks to the Unsloth community the model teams and the HF team π€ http://huggingface.co/unsloth http://huggingface.co/unsloth"
X Link 2024-08-12T15:11Z [----] followers, [----] engagements
"Were excited to share that Unsloth is now backed by @YCombinator Building on our foundation in open-source fine-tuning were creating the all-in-one solution so you can focus on making the models you've always dreamed of without the complexity. With a focus on accuracy speed and accessibility we use math algorithms low-level languages (Triton CUDA) to innovate the LLM ecosystem through software not hardware. Join our waitlist: Read our roadmap: https://unsloth.ai/roadmap-yc https://unsloth.ai/waitlist https://unsloth.ai/roadmap-yc https://unsloth.ai/waitlist"
X Link 2024-09-05T15:27Z [----] followers, 73K engagements
"@rohanpaul_ai @danielhanchen @ycombinator Thank you so much Rohan for the constant support We really really appreciate it π€β₯ And excited to show you some of our new features"
X Link 2024-09-05T21:34Z [----] followers, [--] engagements
"@ynktk1 Hi there apologies. We have now enabled pip install unsloth and will also be working on a possible docker. Is there any particular reason why it was hard to install Thank you π€"
X Link 2024-09-06T00:03Z [----] followers, [---] engagements
"@LucasAtkins7 @danielhanchen @ycombinator Thank you Lucas for being a supporter from day one π"
X Link 2024-09-07T00:55Z [----] followers, [---] engagements
"Llama [---] versions including GGUF's + bnb [--] bit versions + reuploaded versions are now on @HuggingFace See all versions of Llama [---] here: We are actively working on supporting Vision models and 1B and 3B. https://huggingface.co/collections/unsloth/llama-32-66f46afde4ca573864321a22 https://huggingface.co/collections/unsloth/llama-32-66f46afde4ca573864321a22"
X Link 2024-09-25T20:19Z [----] followers, 14.2K engagements
"@snapolino @danielhanchen It's from Meta"
X Link 2024-09-26T02:01Z [----] followers, [--] engagements
"You can finetune Llama-3.2 for free on Colab now Unsloth makes finetuning 2x faster and uses 60% less VRAM with no accuracy degradation. Llama [---] (1B) QLoRA fits on a 4GB GPU and (3B) fits on 7GB. Vision support coming soon. Finetuning Colab: https://colab.research.google.com/drive/1T5-zKWM_5OD21QHwXHiV9ixTRR7k3iB9usp=sharing https://colab.research.google.com/drive/1T5-zKWM_5OD21QHwXHiV9ixTRR7k3iB9usp=sharing https://colab.research.google.com/drive/1T5-zKWM_5OD21QHwXHiV9ixTRR7k3iB9usp=sharing https://colab.research.google.com/drive/1T5-zKWM_5OD21QHwXHiV9ixTRR7k3iB9usp=sharing"
X Link 2024-09-26T16:23Z [----] followers, 102.3K engagements
"@TheCoinCollect8 It's definitely possible but yes very slow. I'd recommend using llama.cpp for this"
X Link 2024-09-26T21:15Z [----] followers, [---] engagements
"Today were releasing a new method that improves the way everyone trains LLMs. There's a significant bug that causes loss miscalculations during training. Our Gradient Accumulation fix corrects the issue reducing L2 norm error by 10x. Blog details: http://unsloth.ai/blog/gradient http://unsloth.ai/blog/gradient"
X Link 2024-10-15T16:46Z [----] followers, 28K engagements
"Join us & @GPU_Mode tomorrow at 3pm ET where we'll talk about our Gradient Accumulation Fix Triton + CUDA kernels & more. Thanks to @MarkSaroufim & @neurosp1ke for inviting us Meeting: https://discord.gg/enps8abKevent=1289330796015915178 https://discord.gg/enps8abKevent=1289330796015915178"
X Link 2024-10-18T19:28Z 32K followers, [----] engagements
"You can finetune Qwen-2.5-Coder-14B for free on Colab now Unsloth makes finetuning 2x faster & uses 60% less VRAM with no accuracy loss. We extended context lengths from 32K to 128K with YaRN & uploaded GGUFs: Finetuning Colab: https://colab.research.google.com/drive/18sN803sU23XuJV9Q8On2xgqHSer6-UZFusp=sharing https://huggingface.co/collections/unsloth/qwen-25-coder-all-versions-6732bc833ed65dd1964994d4 https://colab.research.google.com/drive/18sN803sU23XuJV9Q8On2xgqHSer6-UZFusp=sharing https://huggingface.co/collections/unsloth/qwen-25-coder-all-versions-6732bc833ed65dd1964994d4"
X Link 2024-11-12T19:49Z 12.2K followers, 61.7K engagements
"You can finetune Llama-3.2-Vision-11B for free on Colab now Unsloth finetunes VLMs 2x faster with 50% less VRAM 6x longer context - with no accuracy loss. Documentation: GitHub: Finetuning Colab: https://colab.research.google.com/drive/1j0N4XTY1zXXy7mPAhOC1_gMYZ2F2EBlkusp=sharing https://github.com/unslothai/unsloth https://docs.unsloth.ai/ https://colab.research.google.com/drive/1j0N4XTY1zXXy7mPAhOC1_gMYZ2F2EBlkusp=sharing https://github.com/unslothai/unsloth https://docs.unsloth.ai/ https://colab.research.google.com/drive/1j0N4XTY1zXXy7mPAhOC1_gMYZ2F2EBlkusp=sharing"
X Link 2024-11-21T18:39Z [----] followers, 69.7K engagements
"@TuanPham672604 @_xjdr @IlyasHairline @shah_bu_land @cloneofsimo This was actually a known issue for a long time. We already fixed this issue back in February when Gemma got released and worked with Hugging Face to implement the fixes. See here for more info: https://unsloth.ai/blog/gemma-bugs https://unsloth.ai/blog/gemma-bugs"
X Link 2024-11-29T04:44Z [----] followers, [--] engagements
"Were excited to introduce Unsloth Dynamic 4-bit Quantization Naive quantization often hurts accuracy making models unusable but we dynamically opt not to quantize certain parameters. Our approach delivers significant accuracy gains while only using 10% more VRAM than BitsandBytes 4-bit. Our tests show that standard 4-bit quants performed much worse than the original 16-bit versions while Unsloths Dynamic 4-bit quants provided very accurate & reliable results. Read our Blog: Dynamic 4-bit Quants on @HuggingFace: Colab notebook:"
X Link 2024-12-04T19:03Z 12.2K followers, 45.4K engagements
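A toy sketch of the idea (not Unsloth's implementation): measure the round-trip error of 4-bit quantization per weight matrix and keep error-sensitive ones in 16-bit. The threshold and scheme below are invented for illustration only:

```python
import numpy as np

def fake_quant_4bit(w: np.ndarray) -> np.ndarray:
    """Symmetric round-trip through an int4 grid [-7, 7]."""
    scale = np.abs(w).max() / 7.0
    return np.round(w / scale).clip(-7, 7) * scale

def choose_dtype(w: np.ndarray, tol: float = 0.3) -> str:
    """Opt out of quantization when the relative error is too large."""
    err = np.linalg.norm(w - fake_quant_4bit(w)) / np.linalg.norm(w)
    return "16-bit" if err > tol else "4-bit"

rng = np.random.default_rng(0)
w_typical = rng.normal(size=(64, 64))
w_outlier = w_typical.copy()
w_outlier[0, 0] = 50.0  # one outlier inflates the scale, wrecking precision

print(choose_dtype(w_typical))  # 4-bit
print(choose_dtype(w_outlier))  # 16-bit: this layer stays unquantized
```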
"Llama [---] versions including GGUF's + bnb 4-bit + original 16-bit are now on @HuggingFace See all versions of Llama [---] here: Fine-tuning for Llama [---] (70B) is also now supported Unsloth is 2x faster and uses 70% less memory. https://huggingface.co/collections/unsloth/llama-33-all-versions-67535d7d994794b9d7cf5e9f https://huggingface.co/collections/unsloth/llama-33-all-versions-67535d7d994794b9d7cf5e9f https://huggingface.co/collections/unsloth/llama-33-all-versions-67535d7d994794b9d7cf5e9f https://huggingface.co/collections/unsloth/llama-33-all-versions-67535d7d994794b9d7cf5e9f"
X Link 2024-12-06T21:47Z 22.4K followers, 40.3K engagements
"Llama [---] fine-tuning with ultra long context is here π¦ Unsloth now supports 89K context for @AIatMeta's Llama [---] (70B) on a 80GB GPU - 13x longer than HF+FA2 For Llama [---] (8B) Unsloth enables 342K context surpassing its native 128K support Blog: https://unsloth.ai/blog/llama3-3 https://unsloth.ai/blog/llama3-3 https://unsloth.ai/blog/llama3-3 https://unsloth.ai/blog/llama3-3"
X Link 2024-12-10T18:07Z [----] followers, 33.9K engagements
"@JoshPurtell @danielhanchen @6___0 Qwen QwQ is already supported. You just need to create your own dataset. π"
X Link 2024-12-21T20:30Z [----] followers, [--] engagements
"Learn how to fine-tune Llama for free in [--] mins In this video @jasonzhou1993 uses Unsloth to fine-tune Llama [---] (3B) with a custom dataset to significantly enhance MidJourney prompts. Jason covers the A-Z of fine-tuning including data prep with synthetic data evaluation free Colab training using Unsloth deployment & more in the full video Colab notebook: Documentation: Full video: https://www.youtube.com/watchv=jFl5Fewrieo https://docs.unsloth.ai/ https://colab.research.google.com/drive/1T5-zKWM_5OD21QHwXHiV9ixTRR7k3iB9usp=sharing https://www.youtube.com/watchv=jFl5Fewrieo"
X Link 2024-12-31T17:32Z [----] followers, 22.5K engagements
"Deepseek V3 including GGUF + bf16 versions are now on @HuggingFace Min. requirements to run: 48GB RAM + 250GB of disk space for 2-bit. Includes [--] [--] [--] [--] [--] and 8-bit quantized versions. See all versions of Deepseek V3 & how to run it with examples: https://huggingface.co/collections/unsloth/deepseek-v3-all-versions-677cf5cfd7df8b7815fc723c https://huggingface.co/collections/unsloth/deepseek-v3-all-versions-677cf5cfd7df8b7815fc723c https://huggingface.co/collections/unsloth/deepseek-v3-all-versions-677cf5cfd7df8b7815fc723c"
X Link 2025-01-07T20:36Z 12K followers, 42.4K engagements
"Phi-4 including GGUF + 4-bit + 16-bit versions are now on @HuggingFace We found & fixed [--] bugs in Phi-4 & Llamafied the model. View all Phi-4 versions with our bug fixes: Phi-4 fine-tuning is also supported Unsloth is 2x faster & uses 70% less VRAM. https://huggingface.co/collections/unsloth/phi-4-all-versions-677eecf93784e61afe762afa https://huggingface.co/collections/unsloth/phi-4-all-versions-677eecf93784e61afe762afa https://huggingface.co/collections/unsloth/phi-4-all-versions-677eecf93784e61afe762afa https://huggingface.co/collections/unsloth/phi-4-all-versions-677eecf93784e61afe762afa"
X Link 2025-01-08T23:31Z [----] followers, 21.9K engagements
"You can finetune Phi-4 for free on Colab now Unsloth finetunes LLMs 2x faster with 70% less VRAM 12x longer context - with no accuracy loss. GitHub repo: Documentation: Phi-4 Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Phi_4-Conversational.ipynb https://docs.unsloth.ai https://github.com/unslothai/unsloth https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Phi_4-Conversational.ipynb https://docs.unsloth.ai https://github.com/unslothai/unsloth"
X Link 2025-01-10T18:06Z 14.8K followers, 60.9K engagements
"You can finetune Phi-4 for free on @Kaggle now You'll learn how to: Prepare your dataset Train Phi-4 via Kaggle's free GPUs Run evaluate & save your model Unsloth finetunes LLMs 2x faster with 70% less VRAM & no accuracy loss. Phi-4 notebook: https://www.kaggle.com/code/danielhanchen/phi-4-unsloth-notebook https://www.kaggle.com/code/danielhanchen/phi-4-unsloth-notebook"
X Link 2025-01-16T17:23Z [----] followers, 26K engagements
"@levelsio DeepSeek R1 Distill Llama 8B seems to be the current most popular R1 GGUF and it will definitely run great on your laptop. We uploaded ALL of the GGUF files & they can be directly used with Jan AI llama.cpp Ollama HF etc: https://x.com/UnslothAI/status/1881357596717891955 DeepSeek-R1 GGUF's are now on @HuggingFace Includes all Llama & Qwen distilled models + [--] to 8-bit quantized versions. How to run R1: https://t.co/Ci22Tiu6fb DeepSeek-R1 Collection: https://t.co/JfVV5EA6qO https://x.com/UnslothAI/status/1881357596717891955 DeepSeek-R1 GGUF's are now on @HuggingFace Includes all"
X Link 2025-01-22T12:25Z [----] followers, [----] engagements
"@0xAsharib @tom_doerr @deepseek_ai We do but only for the distilled versions. You can read more in our blog here: https://unsloth.ai/blog/deepseek-r1 https://unsloth.ai/blog/deepseek-r1"
X Link 2025-01-26T20:52Z 11.2K followers, [--] engagements
"Introducing 1.58bit DeepSeek-R1 GGUFs π DeepSeek-R1 can now run in 1.58-bit while being fully functional. We shrank the 671B parameter model from 720GB to just 131GB - a 80% size reduction. Naively quantizing all layers breaks the model entirely causing endless loops & gibberish outputs. Our dynamic quants solve this. The 1.58-bit quant fits in 160GB VRAM (2x H100 80GB) for fast inference at [---] tokens/sec. By studying DeepSeek-R1s architecture we selectively quantized certain layers to higher bits (like 4-bit) and leave most MoE layers to 1.5-bit. Benchmarks + Blog: Dynamic GGUFs"
X Link 2025-01-27T15:25Z 19.9K followers, 685.1K engagements
"Run DeepSeek-R1 (671B) locally on @OpenWebUI - Full Guide No GPU required. Using our 1.58-bit Dynamic GGUF and llama.cpp. Tutorial: https://docs.openwebui.com/tutorials/integrations/deepseekr1-dynamic/ https://docs.openwebui.com/tutorials/integrations/deepseekr1-dynamic/ https://docs.openwebui.com/tutorials/integrations/deepseekr1-dynamic/ https://docs.openwebui.com/tutorials/integrations/deepseekr1-dynamic/"
X Link 2025-01-31T19:05Z 18.8K followers, 66.4K engagements
"@vega_holdings @OpenWebUI Oooh maybe really depends on demand. Issue is the imatrix quants will take a lot of time and money but we'll see. We might release it for V3 first - or maybe not :)"
X Link 2025-02-01T01:42Z 12.2K followers, [----] engagements
"Unsloth is the #1 trending repo on GitHub π¦₯ Its been an incredible journey and we couldnt have done it without you To celebrate were taking a look back at how it all started and how we got here: GitHub repo: http://github.com/unslothai/unsloth http://unsloth.ai/blog/reintroducing http://github.com/unslothai/unsloth http://unsloth.ai/blog/reintroducing"
X Link 2025-02-10T17:15Z 18.4K followers, 41.6K engagements
"Train your own reasoning LLM using DeepSeek's GRPO algorithm with our free notebook You'll transform Llama [---] (8B) to have chain-of-thought. Unsloth makes GRPO use 80% less VRAM. Guide: GitHub: Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.1_(8B)-GRPO.ipynb https://github.com/unslothai/unsloth https://docs.unsloth.ai/basics/reasoning-grpo https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.1_(8B)-GRPO.ipynb https://github.com/unslothai/unsloth https://docs.unsloth.ai/basics/reasoning-grpo"
X Link 2025-02-12T17:21Z 18.7K followers, 101.8K engagements
"Today were launching new algorithms that enable 10x longer context lengths & 90% less VRAM for training Reasoning Models (GRPO). Using Unsloth you can now train your own reasoning model with just 5GB VRAM for Qwen2.5-1.5B with no accuracy loss. Blog: https://unsloth.ai/blog/grpo https://unsloth.ai/blog/grpo https://unsloth.ai/blog/grpo https://unsloth.ai/blog/grpo"
X Link 2025-02-20T18:22Z 21.5K followers, 157.4K engagements
"Tutorial: Train your own Reasoning LLM for free Make Llama [---] (8B) have chain-of-thought with DeepSeek's GRPO. Unsloth enables 90% less VRAM use. Learn about: Reward Functions + dataset prep Training on free Colab GPUs Run + Evaluating Guide: https://docs.unsloth.ai/basics/reasoning-grpo-and-rl/tutorial-train-your-own-reasoning-model-with-grpo https://docs.unsloth.ai/basics/reasoning-grpo-and-rl/tutorial-train-your-own-reasoning-model-with-grpo https://docs.unsloth.ai/basics/reasoning-grpo-and-rl/tutorial-train-your-own-reasoning-model-with-grpo"
X Link 2025-02-25T17:22Z 22K followers, 62.7K engagements
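For a flavor of what "reward functions" means here, a hypothetical example in the spirit of the tutorial (not the notebook's exact code): score each completion on whether it produces a parseable, correct final answer.

```python
import re

def correctness_reward(completions, answers):
    """Hypothetical GRPO-style reward: parseable + correct answers score highest."""
    rewards = []
    for completion, answer in zip(completions, answers):
        match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
        if match is None:
            rewards.append(0.0)      # no parseable answer
        elif match.group(1).strip() == str(answer):
            rewards.append(2.0)      # correct final answer
        else:
            rewards.append(0.5)      # right format, wrong answer
    return rewards

print(correctness_reward(["Think... <answer>4</answer>"], [4]))  # [2.0]
```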
"Unsloth now works on Windows π¦₯ Fine-tune LLMs locally on Windows without Linux or WSL. Tutorial: https://docs.unsloth.ai/get-started/installing-+-updating/windows-installation https://docs.unsloth.ai/get-started/installing-+-updating/windows-installation https://docs.unsloth.ai/get-started/installing-+-updating/windows-installation https://docs.unsloth.ai/get-started/installing-+-updating/windows-installation"
X Link 2025-03-05T17:12Z 20.6K followers, 32.2K engagements
"We made a Guide to teach you how to Fine-tune LLMs correctly Learn about: Choosing the right parameters & training method RL GRPO DPO & CPT Data prep Overfitting & Evaluation Training with Unsloth & deploy on vLLM Ollama Open WebUI π https://docs.unsloth.ai/get-started/fine-tuning-guide https://docs.unsloth.ai/get-started/fine-tuning-guide"
X Link 2025-03-10T16:16Z 20.6K followers, 93K engagements
"You can now fine-tune Gemma [--] for free with our notebook Unsloth makes Gemma [--] finetuning 1.6x faster with 60% less VRAM and 6x longer context lengths - with no accuracy loss. Blogpost: GitHub: Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma3_(4B).ipynb https://github.com/unslothai/unsloth https://unsloth.ai/blog/gemma3 https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma3_(4B).ipynb https://github.com/unslothai/unsloth https://unsloth.ai/blog/gemma3"
X Link 2025-03-14T18:05Z 20.2K followers, 128.5K engagements
"We teamed up with @HuggingFace to release a free notebook for fine-tuning Gemma [--] with GRPO Learn to: Enable reasoning in Gemma [--] (1B) Prepare/understand reward functions Make GRPO work for tiny LLMs Notebook: Details: https://huggingface.co/reasoning-course https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/HuggingFace%20Course-Gemma3_(1B)-GRPO.ipynb https://huggingface.co/reasoning-course https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/HuggingFace%20Course-Gemma3_(1B)-GRPO.ipynb https://huggingface.co/reasoning-course"
X Link 2025-03-19T16:26Z 21.8K followers, 92.7K engagements
"You can now Run DeepSeek-V3-0324 locally using our 2.71-bit Dynamic GGUF We shrank 720GB to 231GB (-70%) by selectively quantizing layers. 2.71bit passes many code tests producing nearly identical results to full 8bit Guide GGUF https://huggingface.co/unsloth/DeepSeek-V3-0324-GGUF https://docs.unsloth.ai/basics/tutorial-how-to-run-deepseek-v3-0324-locally https://huggingface.co/unsloth/DeepSeek-V3-0324-GGUF https://docs.unsloth.ai/basics/tutorial-how-to-run-deepseek-v3-0324-locally"
X Link 2025-03-26T02:08Z 25.9K followers, 47.7K engagements
"Another example including standard 2-bit which produces broken code. The 2.71-bit quant fits in 231GB VRAM (3 H100s) for fast throughput inference at [---] tokens/s. We also uploaded 1.78-bit etc. quants but for best results use our [----] or 2.71-bit quants. To run have at least 160GB combined VRAM + RAM. By studying V3s architecture we selectively quantize layers to higher bits (like 4-bit) and leave other MoE layers to lower bits (2.5-bit)"
X Link 2025-03-26T06:09Z 23.3K followers, [----] engagements
"We made a Guide on how to create Datasets for Fine-tuning Learn to: Curate high-quality datasets (with best practices & examples) Format datasets correctly for conversation SFT GRPO Vision etc. Generate synthetic data with Llama & ChatGPT https://docs.unsloth.ai/basics/datasets-guide https://docs.unsloth.ai/basics/datasets-guide https://docs.unsloth.ai/basics/datasets-guide https://docs.unsloth.ai/basics/datasets-guide"
X Link 2025-04-15T15:13Z 23.1K followers, 56.3K engagements
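As a sketch of the "format datasets correctly for conversation" step, assuming the standard Hugging Face tokenizer API (apply_chat_template) rather than any Unsloth-specific helper; the model repo and row schema are illustrative:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("unsloth/llama-3-8b-Instruct")

def to_text(example):
    """Turn an {instruction, output} row into the model's chat template."""
    messages = [
        {"role": "user", "content": example["instruction"]},
        {"role": "assistant", "content": example["output"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

row = {"instruction": "Summarize: the sky is blue.", "output": "The sky is blue."}
print(to_text(row)["text"])
```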
"Microsoft releases Phi-4 reasoning models You can now run them locally with our Dynamic GGUFs. Phi-4-reasoning-plus is only 14B parameters but performs on par with o1-mini o3-mini and Sonnet [---]. GGUFs: https://huggingface.co/collections/unsloth/phi-4-all-versions-677eecf93784e61afe762afa https://huggingface.co/collections/unsloth/phi-4-all-versions-677eecf93784e61afe762afa Weve been cooking. a new open weights 14B Phi-4 reasoning model SFTd on 1.4M carefully curated reasoning demonstrations from o3-mini and RLd for a tiny bit. This model is a little beast. https://t.co/4xJuvYpZBH"
X Link 2025-05-01T05:03Z 23.4K followers, 83.2K engagements
"You can now fine-tune Qwen3 (14B) for free with our notebook Unsloth makes Qwen3 finetuning 2x faster with 70% less VRAM and 8x longer context lengths - with no accuracy loss. Guide: GitHub: Colab: https://colab.research.google.com/drive/1_ZJD6xqYDvhRbKSQeV8pThLBphcVB9Wnusp=sharing https://github.com/unslothai/unsloth https://docs.unsloth.ai/basics/qwen3-how-to-run-and-fine-tune https://colab.research.google.com/drive/1_ZJD6xqYDvhRbKSQeV8pThLBphcVB9Wnusp=sharing https://github.com/unslothai/unsloth https://docs.unsloth.ai/basics/qwen3-how-to-run-and-fine-tune"
X Link 2025-05-02T16:03Z 24K followers, 191.2K engagements
"Thank you @Google for demoing how to fine-tune Gemma [--] with Unsloth free on Colab π¦₯ #GoogleIO"
X Link 2025-05-20T23:57Z 24.4K followers, 18.1K engagements
"Mistral releases Devstral a new model for coding agents. Devstral-Small-2505 is now the #1 open-source LLM on SWE-Bench Verified. At 24B params & built with All Hands it scores 46.8% on SWE-Bench V - beating GPT-4.1-mini. Run & finetune via our GGUFs: https://huggingface.co/unsloth/Devstral-Small-2505-GGUF https://huggingface.co/unsloth/Devstral-Small-2505-GGUF Meet Devstral our SOTA open model designed specifically for coding agents and developed with @allhands_ai https://t.co/LwDJ04zapf https://t.co/Mm4lYZobGO https://huggingface.co/unsloth/Devstral-Small-2505-GGUF"
X Link 2025-05-21T15:13Z 24.5K followers, 20.5K engagements
"We just crossed [--] million monthly downloads on @HuggingFace π¦₯π€ It's all thanks to you guys - the amazing community model builders and HF team π"
X Link 2025-05-28T14:03Z 24.9K followers, 29.4K engagements
"You can now run DeepSeek-R1-0528 with our Dynamic 1-bit GGUFs π We shrank the full 715GB model to just 185GB (-75% size). We achieve optimal accuracy by selectively quantizing layers. DeepSeek-R1-0528-Qwen3-8B is also supported. GGUFs: https://huggingface.co/unsloth/DeepSeek-R1-0528-GGUF https://huggingface.co/unsloth/DeepSeek-R1-0528-GGUF https://huggingface.co/unsloth/DeepSeek-R1-0528-GGUF https://huggingface.co/unsloth/DeepSeek-R1-0528-GGUF"
X Link 2025-05-30T01:08Z 25.2K followers, 46.2K engagements
"Mistral releases Small [---] (24B) a new update to their [---] model. π₯ The model performs much better on 5-shot MMLU (CoT) instruction following and function/tool calling Run locally with FP8 or 16GB RAM using our Dynamic GGUFs with fixed chat template: https://huggingface.co/unsloth/Mistral-Small-3.2-24B-Instruct-2506-GGUF https://huggingface.co/unsloth/Mistral-Small-3.2-24B-Instruct-2506-GGUF https://huggingface.co/unsloth/Mistral-Small-3.2-24B-Instruct-2506-GGUF https://huggingface.co/unsloth/Mistral-Small-3.2-24B-Instruct-2506-GGUF"
X Link 2025-06-21T14:10Z 25.9K followers, 21.6K engagements
"We made a Guide on mastering LoRA Hyperparameters so you can learn to fine-tune LLMs correctly Learn to: Train smarter models with fewer hallucinations Choose optimal: learning rates epochs LoRA rank alpha Avoid overfitting & underfitting https://docs.unsloth.ai/get-started/fine-tuning-guide/lora-hyperparameters-guide https://docs.unsloth.ai/get-started/fine-tuning-guide/lora-hyperparameters-guide https://docs.unsloth.ai/get-started/fine-tuning-guide/lora-hyperparameters-guide https://docs.unsloth.ai/get-started/fine-tuning-guide/lora-hyperparameters-guide"
X Link 2025-06-24T14:41Z 26.1K followers, 25K engagements
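The knobs the guide covers map onto Unsloth's public LoRA API; the values below are common starting points for illustration, not the guide's prescriptions:

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/llama-3-8b-bnb-4bit", max_seq_length=2048, load_in_4bit=True)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,           # LoRA rank: capacity vs. overfitting trade-off
    lora_alpha=16,  # often set equal to (or double) the rank
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```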
"Run Gemma 3n locally with our Dynamic GGUFsβ¨ @Google's Gemma 3n supports audio vision video & text and the 4B model fits on 8GB RAM for fast local inference. Fine-tuning is also supported in Unsloth. Gemma-3n-E4B GGUF: https://huggingface.co/unsloth/gemma-3n-E4B-it-GGUF https://huggingface.co/unsloth/gemma-3n-E4B-it-GGUF Im so excited to announce Gemma 3n is here π πMultimodal (text/audio/image/video) understanding π€―Runs with as little as 2GB of RAM πFirst model under 10B with @lmarena_ai score of 1300+ Available now on @huggingface @kaggle llama.cpp https://t.co/CNDy479EEv and more"
X Link 2025-06-26T16:48Z 26.1K followers, 35.4K engagements
"You can now fine-tune Gemma 3n for free with our notebook Unsloth makes Google Gemma training 1.5x faster with 50% less VRAM and 5x longer context lengths - with no accuracy loss. Guide: GitHub: Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma3N_(4B)-Conversational.ipynb https://github.com/unslothai/unsloth https://docs.unsloth.ai/basics/gemma-3n-how-to-run-and-fine-tune#fine-tuning-gemma-3n-with-unsloth https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma3N_(4B)-Conversational.ipynb https://github.com/unslothai/unsloth"
X Link 2025-07-01T15:32Z 31.6K followers, 87.5K engagements
"Weve teamed up with @GoogleDeepMind for a challenge with a $10000 Unsloth prize π¦₯ Show off your best fine-tuned Gemma 3n model using Unsloth optimized for an impactful task. The entire hackathon has $150000 prizes to be won Kaggle notebook: https://www.kaggle.com/code/danielhanchen/gemma-3n-4b-multimodal-finetuning-inference https://www.kaggle.com/code/danielhanchen/gemma-3n-4b-multimodal-finetuning-inference https://www.kaggle.com/code/danielhanchen/gemma-3n-4b-multimodal-finetuning-inference https://www.kaggle.com/code/danielhanchen/gemma-3n-4b-multimodal-finetuning-inference"
X Link 2025-07-02T14:17Z 31.6K followers, 47.8K engagements
"The Unsloth Gemma 3n Kaggle notebook can be used for any submission to the $150000 challenges (not just the Unsloth specific one). Gemma 3n competition details: https://www.kaggle.com/competitions/google-gemma-3n-hackathon https://www.kaggle.com/competitions/google-gemma-3n-hackathon"
X Link 2025-07-02T14:49Z 31.5K followers, [----] engagements
"We made step-by-step guides to Fine-tune & Run every single LLM π¦₯ What you'll learn: Technical analysis + Bug fixes explained for each model Best practices & optimal settings How to fine-tune with our notebooks Directory of model variants https://docs.unsloth.ai/basics/tutorials-how-to-fine-tune-and-run-llms https://docs.unsloth.ai/basics/tutorials-how-to-fine-tune-and-run-llms https://docs.unsloth.ai/basics/tutorials-how-to-fine-tune-and-run-llms https://docs.unsloth.ai/basics/tutorials-how-to-fine-tune-and-run-llms"
X Link 2025-07-08T13:47Z 27.8K followers, 50.8K engagements
"Mistral releases Devstral [----] the best open-source model for coding agents π₯ The 24B model is now the #1 open LLM on SWE-Bench Verified scoring 52.4% Run Devstral-Small-2507 locally on 32GB RAM with our Dynamic quants & fine-tune with Unsloth GGUFs: https://huggingface.co/unsloth/Devstral-Small-2507-GGUF https://huggingface.co/unsloth/Devstral-Small-2507-GGUF Introducing Devstral Small and Medium [----] This latest update offers improved performance and cost efficiency perfectly suited for coding agents and software engineering tasks. https://t.co/l6MacctLrv"
X Link 2025-07-10T14:31Z 26.7K followers, 24.8K engagements
"You can now run Kimi K2 locally with our Dynamic 1.8-bit GGUFs We shrank the full 1.1TB model to just 245GB (-80% size reduction). The 2-bit XL GGUF performs exceptionally well on coding & passes all our code tests Guide: GGUFs: https://huggingface.co/unsloth/Kimi-K2-Instruct-GGUF https://docs.unsloth.ai/basics/kimi-k2 https://huggingface.co/unsloth/Kimi-K2-Instruct-GGUF https://docs.unsloth.ai/basics/kimi-k2 https://huggingface.co/unsloth/Kimi-K2-Instruct-GGUF https://docs.unsloth.ai/basics/kimi-k2 https://huggingface.co/unsloth/Kimi-K2-Instruct-GGUF https://docs.unsloth.ai/basics/kimi-k2"
X Link 2025-07-14T15:27Z 27.8K followers, 128.9K engagements
"For fast inference of 5+ tokens/s try to have your RAM + VRAM combined = the size of quant (e.g. 256GB). If not the model will still run with llama.cpp offloading but be slower. Kimi K2 GGUF: https://huggingface.co/unsloth/Kimi-K2-Instruct-GGUF https://huggingface.co/unsloth/Kimi-K2-Instruct-GGUF"
X Link 2025-07-14T16:39Z 27.1K followers, [----] engagements
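The sizing rule reduces to simple arithmetic (a sketch; real speed also depends on memory bandwidth, offload split and context length):

```python
def fits_fast(quant_gb: float, ram_gb: float, vram_gb: float) -> bool:
    """Rule of thumb from the post: RAM + VRAM >= quant size for ~5+ tokens/s."""
    return ram_gb + vram_gb >= quant_gb

print(fits_fast(quant_gb=245, ram_gb=192, vram_gb=80))  # True: 272 >= 245
print(fits_fast(quant_gb=245, ram_gb=128, vram_gb=24))  # False: runs, but slower
```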
"A Complete Guide to Fine-tuning LLMs in [--] mins Learn to: Choose the correct model & training method (LoRA FFT GRPO) Build Datasets & Chat templates Train with Unsloth notebooks Run & deploy your LLM in llama.cpp Ollama & Open WebUI Docs: https://docs.unsloth.ai/ https://docs.unsloth.ai/"
X Link 2025-07-16T13:53Z 28.7K followers, 30.7K engagements
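For the final "deploy in llama.cpp / Ollama" step, Unsloth documents a GGUF export helper; a minimal sketch (the fine-tuning itself is elided, and the quant choice is just a common default):

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/llama-3-8b-bnb-4bit", max_seq_length=2048, load_in_4bit=True)

# ... fine-tune here ...

# Export a llama.cpp-compatible GGUF; the file can also be imported into Ollama.
model.save_pretrained_gguf("my_finetune", tokenizer, quantization_method="q4_k_m")
```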
"@Alibaba_Qwen Congrats guys on the release β¨ We're working on Dynamic quants & GGUFs so the community can run it locally π€"
X Link 2025-07-21T17:58Z 27.3K followers, 12.4K engagements
"You can now run Qwen3-235B-A22B-2507 with our Dynamic 2-bit GGUFs The full 250GB model gets reduced to just 88GB (-65% size). Achieve [--] tokens/s on 89GB unified memory or 80GB RAM + 8GB VRAM. GGUFs: https://huggingface.co/unsloth/Qwen3-235B-A22B-Instruct-2507-GGUF https://huggingface.co/unsloth/Qwen3-235B-A22B-Instruct-2507-GGUF Bye Qwen3-235B-A22B hello Qwen3-235B-A22B-2507 After talking with the community and thinking it through we decided to stop using hybrid thinking mode. Instead well train Instruct and Thinking models separately so we can get the best quality possible. Today were"
X Link 2025-07-22T12:31Z 27.8K followers, 37.3K engagements
"@Alibaba_Qwen Congrats guys on another epic release We're uploading Dynamic GGUFs and one with 1M context length so you guys can run it locally π¦₯ https://huggingface.co/unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF https://huggingface.co/unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF"
X Link 2025-07-22T21:20Z 27.8K followers, 17.2K engagements
"Run Qwen3-Coder with our Dynamic 2-bit GGUFs We shrank the 480B parameter model to just 182GB (down from 512GB). Also run with 1M context length. Achieve [--] tokens/s on 182GB unified memory or 158GB RAM + 24GB VRAM. Qwen3-Coder-480B-A35B GGUFs: https://huggingface.co/unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF https://huggingface.co/unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF Qwen3-Coder is here β
Were releasing Qwen3-Coder-480B-A35B-Instruct our most powerful open agentic code model to date. This 480B-parameter Mixture-of-Experts model (35B active) natively supports 256K context and scales to"
X Link 2025-07-23T01:26Z 27.8K followers, 25.2K engagements
"@Alibaba_Qwen Congrats guys You can run Qwen3-235B-A22B-Thinking-2507 with our Dynamic GGUFs π₯° Run in 2-bit with 88GB unified mem or RAM for 6+ tokens/s. https://huggingface.co/unsloth/Qwen3-235B-A22B-Thinking-2507-GGUF https://huggingface.co/unsloth/Qwen3-235B-A22B-Thinking-2507-GGUF"
X Link 2025-07-25T10:28Z 27.8K followers, 10.6K engagements
"You can now run Qwen3-235B-A22B-Thinking-2507 with our Dynamic 2-bit GGUFs The full 250GB model gets reduced to just 87GB (-65% size). Achieve [--] tokens/s on 88GB unified memory or 80GB RAM + 8GB VRAM. GGUFs: https://huggingface.co/unsloth/Qwen3-235B-A22B-Thinking-2507-GGUF https://huggingface.co/unsloth/Qwen3-235B-A22B-Thinking-2507-GGUF π Were excited to introduce Qwen3-235B-A22B-Thinking-2507 our most advanced reasoning model yet Over the past [--] months weve significantly scaled and enhanced the thinking capability of Qwen3 achieving: β
Improved performance in logical reasoning math science"
X Link 2025-07-25T10:34Z 28.1K followers, 19.6K engagements
"@Alibaba_Qwen Thanks for releasing a smaller model guys π₯° You can now run Qwen3-30B-A3B-0527 using Dynamic GGUFs. Only 33GB RAM or unified mem is needed to run the full 8-bit precision model at [--] tokens/s. https://huggingface.co/unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF https://huggingface.co/unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF"
X Link 2025-07-29T16:23Z 27.7K followers, [----] engagements
"Qwen3-30B-A3B-Instruct-2507 is hereβ¨ The 30B model rivals GPT-4o's performance and runs locally in full precision with just 33GB RAM. Run locally with Unsloth Dynamic GGUFs. Unsloth also supports Qwen3 fine-tuning and RL. GGUF: https://huggingface.co/unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF π Qwen3-30B-A3B Small Update: Smarter faster and local deployment-friendly. β¨ Key Enhancements: β
Enhanced reasoning coding and math skills β
Broader multilingual knowledge β
Improved long-context understanding (up to 256K tokens) β
Better alignment with user intent https://t.co/zsKfKJ2NRG"
X Link 2025-07-29T16:32Z 29.7K followers, 57.8K engagements
"The @GoogleDeepMind Gemma 3n Challenge ($150000 in prizes) ends in [--] days We've made [--] new fine-tuning Gemma 3n Kaggle notebooks (Vision & Audio) to spark your creativity. Your fine-tuned model can compete for any prize. Notebooks + Challenge Details: https://www.kaggle.com/code/danielhanchen/gemma-3n-4b-multimodal-finetuning-inference https://www.kaggle.com/code/danielhanchen/gemma-3n-4b-multimodal-finetuning-inference https://www.kaggle.com/code/danielhanchen/gemma-3n-4b-multimodal-finetuning-inference https://www.kaggle.com/code/danielhanchen/gemma-3n-4b-multimodal-finetuning-inference"
X Link 2025-07-30T13:46Z 28.8K followers, 10.9K engagements
"Qwen3-Coder-Flash is hereβ¨ The 30B model excels in coding & agentic tasks. Run locally with up to 1M context length. Full precision runs with just 33GB RAM. We also fixed tool-calling support for Qwen3-Coder-30B-A3B-Instruct and 480B-A3B. GGUFs: https://huggingface.co/unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF https://huggingface.co/unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF π¦₯ Qwen3-Coder-Flash: Qwen3-Coder-30B-A3B-Instruct π Just lightning-fast accurate code generation. β
Native 256K context (supports up to 1M tokens with YaRN) β
Optimized for platforms like Qwen Code Cline Roo Code Kilo"
X Link 2025-07-31T15:02Z 32.3K followers, 44.6K engagements
"@Alibaba_Qwen You can still run any quant without meeting the 33GB RAM requirements - it'll just be slower. Maximum memory is only needed for the optimal speeds or more context. Here's the link to the 1M context Qwen3-Coder-Flash GGUF: https://huggingface.co/unsloth/Qwen3-Coder-30B-A3B-Instruct-1M-GGUF https://huggingface.co/unsloth/Qwen3-Coder-30B-A3B-Instruct-1M-GGUF"
X Link 2025-07-31T15:23Z 28.8K followers, [----] engagements
"@mrgshum @OpenAI You can still run it with 64GB RAM it'll just be slower. Remember you can run any model size no matter how much compute you have. We're also working on smaller quants once llama.cpp adds in support for it :)"
X Link 2025-08-05T20:22Z 28.8K followers, [----] engagements
"@sama Thank you guys for supporting open-source You can now run the 20B and 120B models locally with our GGUFs π₯° https://huggingface.co/unsloth/gpt-oss-20b-GGUF https://huggingface.co/unsloth/gpt-oss-20b-GGUF"
X Link 2025-08-05T21:21Z 28.8K followers, 10.6K engagements
"You can now fine-tune OpenAI gpt-oss for free with our notebook Unsloth trains 1.5x faster with -70% VRAM 10x longer context & no accuracy loss. 20b fits in 14GB & 120b in 65GB GPU. Guide: GitHub: Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/gpt-oss-(20B)-Fine-tuning.ipynb https://github.com/unslothai/unsloth https://docs.unsloth.ai/basics/gpt-oss https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/gpt-oss-(20B)-Fine-tuning.ipynb https://github.com/unslothai/unsloth https://docs.unsloth.ai/basics/gpt-oss"
X Link 2025-08-08T19:12Z 31.9K followers, 196.1K engagements
"Learn to fine-tune OpenAI gpt-oss with our new step-by-step guide Learn about: Local gpt-oss training + inference FAQ & tips Evaluation hyperparameters & overfitting Reasoning effort Data prep Run & saving your LLM to llama.cpp GGUF HF https://docs.unsloth.ai/basics/tutorial-how-to-fine-tune-gpt-oss https://docs.unsloth.ai/basics/tutorial-how-to-fine-tune-gpt-oss https://docs.unsloth.ai/basics/tutorial-how-to-fine-tune-gpt-oss https://docs.unsloth.ai/basics/tutorial-how-to-fine-tune-gpt-oss"
X Link 2025-08-18T14:02Z 31.2K followers, 41.9K engagements
"@QuixiAI @deepseek_ai Thanks Eric & everyone we really appreciate the support Huge thanks to @ggerganov and the llama.cpp team for making this possible as well and of course to the DeepSeek team π₯°"
X Link 2025-08-23T22:59Z 30.7K followers, [---] engagements
"@elonmusk @xai Thanks for supporting open-source We'll try to investigate how we can create Dynamic GGUFs so everyone can run it locally π"
X Link 2025-08-23T23:05Z 30.7K followers, 50.1K engagements
"RL used to be memory hungry but not anymore Introducing our new kernels & algos that allows faster RL with 50% less VRAM [--] more context & no accuracy loss. RL before required GPU splitting between training & inference. Now with Standby you don't http://docs.unsloth.ai/basics/memory-efficient-rl http://docs.unsloth.ai/basics/memory-efficient-rl"
X Link 2025-09-04T16:02Z 31.7K followers, 69.5K engagements
"You can now run @xAI Grok [---] locally on just 120GB RAM π The 270B parameter model runs [--] t/s on a 128GB Mac with our Dynamic 3-bit GGUF. We shrunk the 539GB model to 118GB (-80%) & left key layers in higher 8-bits Guide: GGUF: https://huggingface.co/unsloth/grok-2-GGUF https://docs.unsloth.ai/basics/grok-2 https://huggingface.co/unsloth/grok-2-GGUF https://docs.unsloth.ai/basics/grok-2 https://huggingface.co/unsloth/grok-2-GGUF https://docs.unsloth.ai/basics/grok-2 https://huggingface.co/unsloth/grok-2-GGUF https://docs.unsloth.ai/basics/grok-2"
X Link 2025-09-08T13:41Z 31.9K followers, 109K engagements
"Unsloth Dynamic GGUFs were introduced early this year where we selectively quantized some layers to as low as 1-bit and important layers to higher bits (6 8-bit). Blog post: Our Dynamic GGUFs consistently performs better on Aider Polyglot when compared to other community quants for the same model size and quant type. To ensure a fair comparison we do the following: We select similar sized files and bit types to each Unsloth quant. We use our fixed chat template if the community quant fails to execute the benchmark. We found some community quants having errors and this gets fixed by using our"
X Link 2025-09-10T15:21Z 31.6K followers, [----] engagements
"You can now train Vision LLMs with Reinforcement Learning in our free notebook Unsloth VLM RL via GRPO: [---] faster 90% less VRAM [--] longer context & no accuracy loss. Guide: GitHub: Qwen2.5-VL Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen2_5_7B_VL_GRPO.ipynb https://github.com/unslothai/unsloth https://docs.unsloth.ai/new/vision-reinforcement-learning-vlm-rl https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen2_5_7B_VL_GRPO.ipynb https://github.com/unslothai/unsloth"
X Link 2025-09-16T16:24Z 32.8K followers, 142.2K engagements
"@Alibaba_Qwen We're all super excited for Qwen3-VL π₯°π"
X Link 2025-09-17T04:07Z 31.6K followers, [----] engagements
"Mistral releases Magistral [---] their new reasoning models π₯ Magistral-Small-2509 excels at coding + math and is a major upgrade over Magistral [---]. Run the 24B model locally with 32GB RAM. Fine-tune with free notebook: GGUFs: https://huggingface.co/unsloth/Magistral-Small-2509-GGUF https://docs.unsloth.ai/models/magistral-how-to-run-and-fine-tune#fine-tuning-magistral-with-unsloth https://huggingface.co/unsloth/Magistral-Small-2509-GGUF https://docs.unsloth.ai/models/magistral-how-to-run-and-fine-tune#fine-tuning-magistral-with-unsloth https://huggingface.co/unsloth/Magistral-Small-2509-GGUF"
X Link 2025-09-17T15:55Z 32.4K followers, 51.2K engagements
"@deepseek_ai Thank you for another update We're excited to make Dynamic GGUFs so you all can run it locally π https://huggingface.co/unsloth/DeepSeek-V3.1-Terminus-GGUF https://huggingface.co/unsloth/DeepSeek-V3.1-Terminus-GGUF"
X Link 2025-09-22T13:42Z 32K followers, 20.8K engagements
"We're teaming up with @MistralAI and @NVIDIA for an Unsloth event on Tues Oct [--] at @YCombinator's office π¦₯ Join us in San Francisco for a night of talks merch and more. Food & drinks provided. RSVP required http://lu.ma/unsloth-yc http://lu.ma/unsloth-yc http://lu.ma/unsloth-yc http://lu.ma/unsloth-yc"
X Link 2025-09-22T14:05Z 32.3K followers, 30.2K engagements
"You can now train OpenAI gpt-oss with Reinforcement Learning in our free notebook This notebook automatically creates faster kernels via RL. Unsloth RL achieves the fastest inference & lowest VRAM vs. any setup - [--] accuracy loss gpt-oss-20b GRPO Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/gpt-oss-(20B)-GRPO.ipynb https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/gpt-oss-(20B)-GRPO.ipynb https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/gpt-oss-(20B)-GRPO.ipynb"
X Link 2025-09-26T15:45Z 32.7K followers, 123.4K engagements
"The notebook shows how to counteract reward-hacking which is one of RL's biggest challenges. Blog + details: Since inference is crucial and vLLM is incompatible with gpt-oss RL we developed custom algorithms in Unsloth to deliver the fastest inference (3 faster) the lowest VRAM usage (50% less) and longest context lengths (8 more) - without any accuracy degradation. https://docs.unsloth.ai/new/gpt-oss-reinforcement-learning https://docs.unsloth.ai/new/gpt-oss-reinforcement-learning https://docs.unsloth.ai/new/gpt-oss-reinforcement-learning"
X Link 2025-09-26T15:45Z 32.6K followers, [----] engagements
"Join us @Pytorch and @AMD for a Virtual Hackathon on Oct 18-20. π₯ Win $10K in prizes by training the best AI agent via Unsloth Sign up here: https://luma.com/4i64p3ec https://luma.com/4i64p3ec"
X Link 2025-09-27T17:11Z 32.4K followers, [---] engagements
"@deepseek_ai Thank you guys once again for supporting open-source and making AI more accessible Hopefully we'll be able to make GGUFs to allow everyone to run DeepSeek-V3.2-Exp locally π"
X Link 2025-09-29T12:22Z 32.6K followers, 21.7K engagements
"LoRA in reinforcement learning (RL) can match full-finetuning performance when done right π‘ A new @thinkymachines post shows how using 10x larger learning rates applying LoRA on all layers & more LoRA at rank=1 even works. We're excited to have collaborated on this blog LoRA makes fine-tuning more accessible but it's unclear how it compares to full fine-tuning. We find that the performance often matches closely---more often than you might expect. In our latest Connectionism post we share our experimental results and recommendations for LoRA. https://t.co/DcVmUKeOyw LoRA makes fine-tuning"
X Link 2025-09-29T20:59Z 32.7K followers, 61.7K engagements
"IBM releases Granite-4.0 their new series of open models Run the 'Micro' 3B model on 4GB RAM or 'Small' 32B on 40GB RAM. Granite-4.0 excels at agentic tasks doc analysis RAG edge AI applications & more Dynamic GGUFs: Guide: https://docs.unsloth.ai/new/ibm-granite-4.0 https://huggingface.co/collections/unsloth/granite-40-68ddf64b4a8717dc22a9322d https://docs.unsloth.ai/new/ibm-granite-4.0 https://huggingface.co/collections/unsloth/granite-40-68ddf64b4a8717dc22a9322d https://docs.unsloth.ai/new/ibm-granite-4.0 https://huggingface.co/collections/unsloth/granite-40-68ddf64b4a8717dc22a9322d"
X Link 2025-10-02T14:14Z 32.8K followers, 42.7K engagements
"@Alibaba_Qwen Go Qwen team Thank you for releasing smaller models π"
X Link 2025-10-04T01:59Z 32.7K followers, [----] engagements
"@AMD @OpenAI Congrats We're also very excited to enable local efficient fine-tuning and reinforcement learning for AMD GPUs very soon ππ¦₯"
X Link 2025-10-06T12:48Z 32.5K followers, [----] engagements
"Thank you @dkundel from OpenAI and Barath from NVIDIA for the collab. π₯° Watch Dominik's full gpt-oss presentation: https://www.youtube.com/watchv=1HL2YHRj270 https://www.youtube.com/watchv=1HL2YHRj270"
X Link 2025-10-09T14:23Z 32.8K followers, [----] engagements
"DeepSeek-R1 GGUF's are now on @HuggingFace Includes all Llama & Qwen distilled models + [--] to 8-bit quantized versions. How to run R1: DeepSeek-R1 Collection: https://huggingface.co/collections/unsloth/deepseek-r1-all-versions-678e1c48f5d2fce87892ace5 https://unsloth.ai/blog/deepseek-r1 https://huggingface.co/collections/unsloth/deepseek-r1-all-versions-678e1c48f5d2fce87892ace5 https://unsloth.ai/blog/deepseek-r1 https://huggingface.co/collections/unsloth/deepseek-r1-all-versions-678e1c48f5d2fce87892ace5 https://unsloth.ai/blog/deepseek-r1"
X Link 2025-01-20T15:06Z 18.3K followers, 68.4K engagements
"You can now reproduce DeepSeek-R1's reasoning on your own local device Experience the "Aha" moment with just 7GB VRAM. Unsloth reduces GRPO training memory use by 80%. 15GB VRAM can transform Llama-3.1 (8B) & Phi-4 (14B) into reasoning models. Blog: http://unsloth.ai/blog/r1-reasoning http://unsloth.ai/blog/r1-reasoning http://unsloth.ai/blog/r1-reasoning http://unsloth.ai/blog/r1-reasoning"
X Link 2025-02-06T18:03Z 24K followers, [----] engagements
"You can now fine-tune TTS models with Unsloth Train run and save models like Sesame-CSM and OpenAI's Whisper locally with our free notebooks. Unsloth makes TTS training 1.5x faster with 50% less VRAM. GitHub: Docs & Notebooks: https://docs.unsloth.ai/basics/text-to-speech-tts-fine-tuning https://github.com/unslothai/unsloth https://docs.unsloth.ai/basics/text-to-speech-tts-fine-tuning https://github.com/unslothai/unsloth https://docs.unsloth.ai/basics/text-to-speech-tts-fine-tuning https://github.com/unslothai/unsloth https://docs.unsloth.ai/basics/text-to-speech-tts-fine-tuning"
X Link 2025-05-15T16:38Z 33.8K followers, 127.3K engagements
"We made a repo with 100+ Fine-tuning notebooks all in once place Has guides & examples for: Tool-calling Classification Synthetic data BERT TTS Vision LLMs GRPO DPO SFT CPT Dataprep eval saving Llama Qwen Gemma Phi DeepSeek https://github.com/unslothai/notebooks/ https://github.com/unslothai/notebooks/ https://github.com/unslothai/notebooks/ https://github.com/unslothai/notebooks/"
X Link 2025-06-04T13:29Z 36.7K followers, 84.7K engagements
"We made a complete Guide on Reinforcement Learning for LLMs Learn about: RL's goal & why it's key to building intelligent AI agents Why o3 Claude [--] & R1 use RL GRPO RLHF DPO reward functions Training your own local R1 model via Unsloth https://docs.unsloth.ai/basics/reinforcement-learning-guide https://docs.unsloth.ai/basics/reinforcement-learning-guide"
X Link 2025-06-17T14:36Z 39.7K followers, 70.5K engagements
"You can now run the worlds most powerful Western open models locally The hybrid reasoning 671B model matches o3 & Claude-4-Opus in performance. Trained on Llama [--] & DeepSeek-R1 Cogito-v2 has [--] variantseach setting new benchmarks. Guide + GGUFs: https://docs.unsloth.ai/basics/tutorials-how-to-fine-tune-and-run-llms/cogito-v2-how-to-run-locally Today we are releasing [--] hybrid reasoning models of sizes 70B 109B MoE 405B 671B MoE under open license. These are some of the strongest LLMs in the world and serve as a proof of concept for a novel AI paradigm - iterative self-improvement (AI systems"
X Link 2025-08-01T00:49Z 33.3K followers, 40.7K engagements
"@OpenAI Amazing guys Super excited to support them so y'all can run & fine-tune them locally π€©"
X Link 2025-08-05T17:05Z 36.6K followers, 27.6K engagements
"You can now run gpt-oss-120b & 20b locally with our GGUFs π¦₯ Run OpenAI's 120b model on 66GB RAM & 20b model on 14GB RAM. Both in original precision. Uploads includes our chat template fixes. Guide: GGUF: https://huggingface.co/unsloth/gpt-oss-20b-GGUF https://docs.unsloth.ai/basics/gpt-oss https://huggingface.co/unsloth/gpt-oss-20b-GGUF https://docs.unsloth.ai/basics/gpt-oss Our open models are here. Both of them. https://t.co/9tFxefOXcg https://huggingface.co/unsloth/gpt-oss-20b-GGUF https://docs.unsloth.ai/basics/gpt-oss https://huggingface.co/unsloth/gpt-oss-20b-GGUF"
X Link 2025-08-05T20:10Z 38.9K followers, 95.8K engagements
"Google releases Gemma [--] 270M a new model that runs locally on just [---] GB RAM.β¨ Trained on 6T tokens it runs fast on phones & handles chat coding & math. Run at [--] t/s with our Dynamic GGUF or fine-tune via Unsloth & export to your phone. Details: https://docs.unsloth.ai/basics/gemma-3-how-to-run-and-fine-tune https://docs.unsloth.ai/basics/gemma-3-how-to-run-and-fine-tune Introducing Gemma [--] 270M π₯ π€A tiny model Just [---] million parameters π§ Very strong instruction following π€ Fine-tune in just a few minutes with a large vocabulary to serve as a high-quality foundation"
X Link 2025-08-14T16:18Z 33.8K followers, 156.6K engagements
"Can a 1-bit or 3-bit quantized model outperform GPT-4.1 or Claude-Opus-4 Yes Today we're excited to show how LLMs like DeepSeek-V3.1 can be quantized to just 1-bit or 3-bit and still beat SOTA models like Claude-Opus-4 (thinking) on Aider Polyglot. Details and blog below"
X Link 2025-09-10T15:21Z 38.1K followers, 165.4K engagements
"We made a free notebook that fine-tunes IBM Granite [---] into a powerful support agent This agent will enable real-time analysis & solving of customer interactions. You'll also learn how to train models using data from Google Sheets. Colab Notebook: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Granite4.0.ipynb https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Granite4.0.ipynb https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Granite4.0.ipynb"
X Link 2025-10-02T15:37Z 33.4K followers, 50.6K engagements
"OpenAI shows how gpt-oss can autonomously beat [----] using reinforcement learning (RL). Training was done locally with Unsloth on NVIDIA DGX Spark. You can also do it free on Colab. π¦₯ OpenAI DevDay notebook: https://github.com/openai/gpt-oss/blob/main/examples/reinforcement-fine-tuning.ipynb https://github.com/openai/gpt-oss/blob/main/examples/reinforcement-fine-tuning.ipynb https://github.com/openai/gpt-oss/blob/main/examples/reinforcement-fine-tuning.ipynb https://github.com/openai/gpt-oss/blob/main/examples/reinforcement-fine-tuning.ipynb"
X Link 2025-10-09T13:50Z 33.7K followers, 97.6K engagements
"You can now train models up to 200B parameters locally on NVIDIA DGX Spark with Unsloth π¦₯ Fine-tune RL & deploy OpenAI gpt-oss-120b via our free notebook in 68GB unified memory: Read our step-by-step guide in collab with NVIDIA https://docs.unsloth.ai/new/fine-tuning-llms-with-nvidia-dgx-spark-and-unsloth https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/gpt-oss-(120B)_A100-Fine-tuning.ipynb https://docs.unsloth.ai/new/fine-tuning-llms-with-nvidia-dgx-spark-and-unsloth"
X Link 2025-10-15T13:43Z 33.8K followers, 46.4K engagements
"You can now fine-tune Qwen3-VL (8B) for free with our notebook Unsloth trains VLMs 1.7x faster with 60% less VRAM and 8x longer context - no accuracy loss. GitHub: Qwen3-VL GRPO Colab: Qwen3-VL Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen3_VL_(8B)-Vision.ipynb https://docs.unsloth.ai/models/qwen3-vl-run-and-fine-tune#fine-tuning-qwen3-vl https://github.com/unslothai/unsloth https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen3_VL_(8B)-Vision.ipynb https://docs.unsloth.ai/models/qwen3-vl-run-and-fine-tune#fine-tuning-qwen3-vl"
X Link 2025-10-16T13:51Z 34.9K followers, 107.9K engagements
"We just hit [---] million lifetime downloads on Hugging Face π¦₯π€ Huge thanks to all of you The amazing community model creators and HF team. π"
X Link 2025-10-21T13:45Z 33.4K followers, 31.7K engagements
"You can now quantize LLMs to 4-bit and recover 70% accuracy via Quantization-Aware Training. We teamed up with @PyTorch to show how QAT enables: 4x less VRAM with no inference overhead 1-3% increase in raw accuracy (GPQA MMLU Pro) Notebook & Blog: https://docs.unsloth.ai/new/quantization-aware-training-qat https://docs.unsloth.ai/new/quantization-aware-training-qat https://docs.unsloth.ai/new/quantization-aware-training-qat https://docs.unsloth.ai/new/quantization-aware-training-qat"
X Link 2025-10-22T15:36Z 34K followers, 44.9K engagements
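The core QAT trick can be shown in a few lines: fake-quantize in the forward pass and let gradients flow through unchanged (a straight-through estimator). This toy is illustrative only, not the PyTorch/torchao recipe the blog uses:

```python
import torch

class FakeQuant4bit(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        scale = w.abs().max() / 7.0   # symmetric int4 grid [-7, 7]
        return torch.round(w / scale).clamp(-7, 7) * scale

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out               # straight-through: gradient passes as-is

w = torch.randn(4, 4, requires_grad=True)
FakeQuant4bit.apply(w).sum().backward()
print(w.grad)  # dense gradients despite the non-differentiable rounding
```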
"We showcased our one click fine-tuning UI for the first time at the NVIDIA x Mistral AI x Unsloth event at Y Combinator π₯π¦₯ Huge thanks to everyone who came π₯° π Thank you to everyone who joined us at AI Dev Night with @UnslothAI and @MistralAI. We're looking forward to meeting more of you at #PyTorchCon #OpenSourceAIWeek. https://t.co/xCJrGMrbZ4 π Thank you to everyone who joined us at AI Dev Night with @UnslothAI and @MistralAI. We're looking forward to meeting more of you at #PyTorchCon #OpenSourceAIWeek. https://t.co/xCJrGMrbZ4"
X Link 2025-10-23T19:54Z 38.5K followers, 14.4K engagements
"We teamed up with @NVIDIA to teach you how to fine-tune LLMs on Blackwell & RTX [--] GPUs. Unsloth makes training on Blackwell up to [--] faster with 70% less VRAM - no accuracy loss. Learn how to use our new Docker image & more in the official NVIDIA Blog: https://developer.nvidia.com/blog/train-an-llm-on-an-nvidia-blackwell-desktop-with-unsloth-and-scale-it/ https://developer.nvidia.com/blog/train-an-llm-on-an-nvidia-blackwell-desktop-with-unsloth-and-scale-it/ https://developer.nvidia.com/blog/train-an-llm-on-an-nvidia-blackwell-desktop-with-unsloth-and-scale-it/"
X Link 2025-10-27T14:02Z 34K followers, 35.2K engagements
"You can now run Qwen3-VL locally π Run the 235B variant for SOTA vision/OCR on 128GB unified memory (dynamic 4-bit). Includes our chat template fixes. Qwen3-VL-2B runs at [--] t/s on 4GB RAM. Fine-tune & RL via Unsloth free notebooks & export to GGUF. https://docs.unsloth.ai/models/qwen3-vl https://docs.unsloth.ai/models/qwen3-vl https://docs.unsloth.ai/models/qwen3-vl https://docs.unsloth.ai/models/qwen3-vl"
X Link 2025-10-31T13:31Z 39.7K followers, 92.8K engagements
"To run Qwen3-VL you can read our step-by-step tutorial and download the GGUFs from our Hugging Face collection: https://huggingface.co/collections/unsloth/qwen3-vl https://huggingface.co/collections/unsloth/qwen3-vl"
X Link 2025-10-31T15:26Z 34.1K followers, [----] engagements
"@Alibaba_Qwen Thank you for the support ππ¦₯ Here's our free Colab notebooks for fine-tuning and reinforcement learning (RL) of Qwen3-VL-8B: https://x.com/UnslothAI/status/1978821090135687182 You can now fine-tune Qwen3-VL (8B) for free with our notebook Unsloth trains VLMs 1.7x faster with 60% less VRAM and 8x longer context - no accuracy loss. GitHub: https://t.co/aZWYAt9MMh Qwen3-VL GRPO Colab: https://t.co/HkjYydXDnR Qwen3-VL Colab: https://t.co/r3p2wgIzVS https://x.com/UnslothAI/status/1978821090135687182 You can now fine-tune Qwen3-VL (8B) for free with our notebook Unsloth trains VLMs"
X Link 2025-11-02T03:18Z 34K followers, [----] engagements
"You can now fine-tune DeepSeek-OCR with our free notebook We fine-tuned DeepSeek-OCR improving its language understanding by 89% and reduced Character Error Rate from 149% to 60% Blog: GitHub: Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Deepseek_OCR_(3B)-Eval.ipynb https://github.com/unslothai/unsloth https://docs.unsloth.ai/new/deepseek-ocr https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Deepseek_OCR_(3B)-Eval.ipynb https://github.com/unslothai/unsloth https://docs.unsloth.ai/new/deepseek-ocr"
X Link 2025-11-04T15:20Z 34.4K followers, 80.6K engagements
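Character Error Rate is edit distance divided by reference length, which is why it can exceed 100% (as in the 149% above). A self-contained sketch; the notebook may compute it via a library such as jiwer instead:

```python
def cer(reference: str, hypothesis: str) -> float:
    """Levenshtein distance over reference length."""
    m, n = len(reference), len(hypothesis)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[m][n] / max(m, 1)

print(cer("unsloth", "unslotth"))  # one insertion over 7 chars ~= 0.143
```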
"@donvito Most models with up to 32B parameters (e.g. Qwen3-32B) can fine-tune locally with Unsloth on a 24GB VRAM GPU. π₯° LoRA or FFT will use much more VRAM though. You can find more details about this in our docs"
X Link 2025-11-04T16:42Z 34.1K followers, [----] engagements
"You can now run Kimi K2 Thinking locally with our Dynamic 1-bit GGUFs We shrank the 1T model to 245GB (-62%) & retained 85% of accuracy. Run on 247GB RAM. We also worked with the Kimi team on a system prompt fix. Guide: GGUF: https://huggingface.co/unsloth/Kimi-K2-Thinking-GGUF https://docs.unsloth.ai/models/kimi-k2-thinking-how-to-run-locally https://huggingface.co/unsloth/Kimi-K2-Thinking-GGUF https://docs.unsloth.ai/models/kimi-k2-thinking-how-to-run-locally https://huggingface.co/unsloth/Kimi-K2-Thinking-GGUF https://docs.unsloth.ai/models/kimi-k2-thinking-how-to-run-locally"
X Link 2025-11-08T15:43Z 36.7K followers, 178.2K engagements
"You can also run Kimi K2 Thinking in full precision by using our 4-bit or 5-bit GGUFs since the original model was released as INT4. π This will require 520GB - 730GB RAM/VRAM for fast inference"
X Link 2025-11-08T17:13Z 36.7K followers, [----] engagements
"You can now run Unsloth GGUFs locally via Docker Run LLMs on Mac or Windows with one line of code or no code at all We collabed with Docker to make Dynamic GGUFs available for everyone Just run: docker model run ai/gpt-oss:20B Guide: https://docs.unsloth.ai/models/how-to-run-llms-with-docker https://docs.unsloth.ai/models/how-to-run-llms-with-docker https://docs.unsloth.ai/models/how-to-run-llms-with-docker https://docs.unsloth.ai/models/how-to-run-llms-with-docker"
X Link 2025-11-17T14:33Z 35.3K followers, 93.2K engagements
"We made a guide on how to deploy LLMs locally with SGLang In collab with @lmsysorg you'll learn to: Deploy fine-tuned LLMs for large scale production Serve GGUFs locally Benchmark inference speed Use on the fly FP8 for 1.6x inference Guide: https://docs.unsloth.ai/basics/inference-and-deployment/sglang-guide https://docs.unsloth.ai/basics/inference-and-deployment/sglang-guide"
X Link 2025-11-21T14:40Z 35.2K followers, 29K engagements
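Once a server is up (the guide covers launching it), SGLang exposes an OpenAI-compatible endpoint; a hedged client sketch, where the localhost port 30000 and model name "default" are assumptions:

```python
import requests

resp = requests.post(
    "http://localhost:30000/v1/chat/completions",  # port is an assumption
    json={
        "model": "default",                        # assumed served-model name
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 64,
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```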
"You can now run FP8 reinforcement learning on consumer GPUs Try DeepSeek-R1s FP8 GRPO at home using only a 5GB GPU. Qwen3-1.7B fits in 5GB VRAM. We collabed with PyTorch to make FP8 RL inference [---] faster. Unsloth: 60% less VRAM [--] longer context. https://docs.unsloth.ai/new/fp8-reinforcement-learning https://docs.unsloth.ai/new/fp8-reinforcement-learning https://docs.unsloth.ai/new/fp8-reinforcement-learning https://docs.unsloth.ai/new/fp8-reinforcement-learning"
X Link 2025-11-25T16:37Z 38K followers, 144.6K engagements
"You can now do 500K context length fine-tuning with Unsloth Train gpt-oss-20b to extend its context window to 530K on 80GB VRAM & 750K+ on 192GB - no accuracy loss. Unsloth's new algorithms + Tiled MLP = 72% less VRAM & 6x more context Blog + Notebook: https://docs.unsloth.ai/new/500k-context-length-fine-tuning https://docs.unsloth.ai/new/500k-context-length-fine-tuning https://docs.unsloth.ai/new/500k-context-length-fine-tuning https://docs.unsloth.ai/new/500k-context-length-fine-tuning"
X Link 2025-12-01T14:45Z 37.7K followers, 41K engagements
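A toy illustration of the Tiled MLP idea (a sketch, not Unsloth's kernels): process the sequence in chunks so that, combined with activation checkpointing, only one chunk's wide intermediate activations are alive at a time.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TiledMLP(nn.Module):
    def __init__(self, d_model=256, d_ff=1024, chunk=1024):
        super().__init__()
        self.up = nn.Linear(d_model, d_ff)
        self.down = nn.Linear(d_ff, d_model)
        self.chunk = chunk

    def forward(self, x):  # x: (batch, seq, d_model)
        # Peak d_ff-sized activation memory scales with `chunk`, not seq length.
        outs = [self.down(F.gelu(self.up(part)))
                for part in x.split(self.chunk, dim=1)]
        return torch.cat(outs, dim=1)

x = torch.randn(1, 4096, 256)
print(TiledMLP()(x).shape)  # torch.Size([1, 4096, 256])
```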
"To clarify yes this release supports any LLM or VLM not just gpt-oss - with limited RL support as well. :) More details in our blogpost"
X Link 2025-12-01T15:27Z 35.2K followers, [----] engagements
"Mistral releases Ministral [--] their new reasoning and instruct models π₯ Ministral [--] comes in 3B 8B and 14B with vision support and best-in-class performance. Run the 14B models locally with 24GB RAM. Guide + Notebook: GGUFs: https://huggingface.co/collections/unsloth/ministral-3 https://docs.unsloth.ai/new/ministral-3 Introducing the Mistral [--] family of models: Frontier intelligence at all sizes. Apache [---]. Details in π§΅ https://t.co/lsrDmhW78u https://huggingface.co/collections/unsloth/ministral-3 https://docs.unsloth.ai/new/ministral-3 Introducing the Mistral [--] family of models: Frontier"
X Link 2025-12-02T15:17Z 40.3K followers, 81.6K engagements
"@Alibaba_Qwen Let's gooo Qwen & open-source ππ¦₯"
X Link 2025-12-04T08:36Z 35.3K followers, [---] engagements
"You can now train Mistral Ministral [--] with reinforcement learning in our free notebook You'll GRPO the model to solve sudoku autonomously. Learn about our new reward functions RL environment & reward hacking. Blog: Notebook: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Ministral_3_(3B)_Reinforcement_Learning_Sudoku_Game.ipynb https://docs.unsloth.ai/new/ministral-3 https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Ministral_3_(3B)_Reinforcement_Learning_Sudoku_Game.ipynb https://docs.unsloth.ai/new/ministral-3"
X Link 2025-12-04T15:01Z 37.7K followers, 41K engagements
"NVIDIA releases Nemotron [--] Nano a new 30B hybrid reasoning model π₯ Nemotron [--] has a 1M context window and the best in class performance for SWE-Bench reasoning and chat. Run the MoE model locally with 24GB RAM. Guide: GGUF: https://huggingface.co/unsloth/Nemotron-3-Nano-30B-A3B-GGUF https://docs.unsloth.ai/models/nemotron-3 https://huggingface.co/unsloth/Nemotron-3-Nano-30B-A3B-GGUF https://docs.unsloth.ai/models/nemotron-3"
X Link 2025-12-15T14:07Z 38.5K followers, 138.3K engagements
"We teamed up with @NVIDIA and @MatthewBerman to teach you how to do Reinforcement Learning Learn about: - RL environments reward functions & reward hacking - Training OpenAI gpt-oss to automatically solve [----] - Local Windows training with @NVIDIA_AI_PC RTX GPUs - How RLVR (verifiable rewards) works - How to interpret RL metrics like KL Divergence Full video tutorial: https://www.youtube.com/watchv=9t-BAjzBWj8 https://www.youtube.com/watchv=9t-BAjzBWj8"
X Link 2025-12-16T14:31Z 38.5K followers, 51.9K engagements
"Google releases FunctionGemma a new 270M parameter model that runs on just [---] GB RAM.β¨ Built for tool-calling run locally on your phone at 50+ tokens/s or fine-tune with Unsloth & deploy to your phone. Docs + Notebook: GGUF: https://huggingface.co/unsloth/functiongemma-270m-it-GGUF https://docs.unsloth.ai/models/functiongemma https://huggingface.co/unsloth/functiongemma-270m-it-GGUF https://docs.unsloth.ai/models/functiongemma Introducing FunctionGemma π€270m model for function calling π±can run in your phone browser or other devices π€designed to be specialized for your own tasks"
X Link 2025-12-18T17:22Z 38.2K followers, 219.4K engagements
"@Alibaba_Qwen Congrats guys this is an amazing open-source effort ππ₯° We made Qwen-Image-Edit-2511 GGUFs so everyone can run it locally π https://huggingface.co/unsloth/Qwen-Image-Edit-2511-GGUF https://huggingface.co/unsloth/Qwen-Image-Edit-2511-GGUF"
X Link 2025-12-23T16:17Z 38.5K followers, 32.5K engagements
"@NVIDIAAIDev Thanks guys for the constant support π¦₯π"
X Link 2025-12-23T23:29Z 38.1K followers, [---] engagements
"Merry Christmas from Unsloth ππ Thank you for all the support this year Were excited to keep shipping open-source next year π₯°"
X Link 2025-12-24T14:53Z 38.5K followers, [----] engagements
"We just crossed [-----] stars on GitHub π¦₯ Huge thanks to you every contributor and our amazing community for all your support. Our GitHub repo: https://github.com/unslothai/unsloth https://github.com/unslothai/unsloth"
X Link 2025-12-30T14:32Z 38.6K followers, 19.9K engagements
"@Alibaba_Qwen [----] was so amazing because of Qwen We're super excited for Qwen4 in [----] ππ₯°"
X Link 2025-12-31T04:47Z 38.6K followers, [----] engagements
"@Alibaba_Qwen Thanks guys for the support and day zero access We're excited for more Qwen in [----] ππΈ"
X Link 2025-12-31T10:13Z 38.6K followers, [----] engagements
"We made a guide on how to run Qwen-Image diffusion models locally Learn to: Run Qwen-Image-2512 and Edit-2511 Use GGUF FP8 in ComfyUI stable-diffusion.cpp diffusers Create workflows & prompts Adjust hyperparams (sampling guidance) Guide: https://unsloth.ai/docs/models/qwen-image-2512 https://unsloth.ai/docs/models/qwen-image-2512 https://unsloth.ai/docs/models/qwen-image-2512 https://unsloth.ai/docs/models/qwen-image-2512"
X Link 2026-01-08T14:37Z 39.7K followers, 30.8K engagements
"@Zai_org Thank you guys for this amazing release You can now run & fine-tune the model locally: https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF"
X Link 2026-01-20T05:13Z 39.8K followers, [----] engagements
"Update: For improved performance please use: --dry-multiplier [---] --temp [---] --top-k [--] --top-p [----] --min-p [----] which should reduce any looping or incorrect output issues. π --dry-multiplier [---] especially works well. For more information see: https://unsloth.ai/docs/models/glm-4.7-flash#reducing-repetition-and-looping https://unsloth.ai/docs/models/glm-4.7-flash#reducing-repetition-and-looping https://unsloth.ai/docs/models/glm-4.7-flash#reducing-repetition-and-looping https://unsloth.ai/docs/models/glm-4.7-flash#reducing-repetition-and-looping"
X Link 2026-01-20T07:26Z 39.3K followers, [----] engagements
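The flag values in the post above are redacted, so the sketch below uses placeholder numbers purely to show how the named llama.cpp sampling flags are wired up from Python; consult the linked GLM-4.7-Flash guide for the actual recommendations.

```python
# Sketch: passing llama.cpp sampling flags from Python. All numeric values
# below are placeholders, NOT the recommended settings - those are in the
# GLM-4.7-Flash guide linked above. DRY sampling penalizes verbatim
# repeats, which is why it helps with looping.
import subprocess

subprocess.run([
    "llama-cli",
    "-m", "GLM-4.7-Flash-Q4_K_M.gguf",  # hypothetical local filename
    "--dry-multiplier", "0.5",           # placeholder value
    "--temp", "1.0",                     # placeholder value
    "--top-k", "40",                     # placeholder value
    "--top-p", "0.95",                   # placeholder value
    "--min-p", "0.05",                   # placeholder value
    "-p", "Write a short changelog entry.",
])
```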
"@Zai_org Congrats guys on release & thank you for supporting open-source π π₯° We uploaded GLM-5 GGUFs so people can run it locally: https://huggingface.co/unsloth/GLM-5-GGUF https://huggingface.co/unsloth/GLM-5-GGUF"
X Link 2026-02-11T19:20Z 43.5K followers, 16.6K engagements
"You can now run GLM-5 locallyπ₯ GLM-5 is a new open SOTA agentic coding & chat LLM with 200K context. We shrank the 744B model from 1.65TB to 241GB (-85%) via Dynamic 2-bit. Runs on a 256GB Mac or RAM/VRAM setups. Guide: GGUF: https://huggingface.co/unsloth/GLM-5-GGUF https://unsloth.ai/docs/models/glm-5 Introducing GLM-5: From Vibe Coding to Agentic Engineering GLM-5 is built for complex systems engineering and long-horizon agentic tasks. Compared to GLM-4.5 it scales from 355B params (32B active) to 744B (40B active) with pre-training data growing from 23T to 28.5T tokens."
X Link 2026-02-12T12:55Z 43.5K followers, 225.2K engagements
"Run DeepSeek-V3.1 locally on 170GB RAM with Dynamic 1-bit GGUFsπ The 715GB model gets reduced to 170GB (-80% size) by smartly quantizing layers. The 1-bit GGUF passes all our code tests & we fixed the chat template Guide: GGUF: https://huggingface.co/unsloth/DeepSeek-V3.1-GGUF https://docs.unsloth.ai/basics/deepseek-v3.1 https://huggingface.co/unsloth/DeepSeek-V3.1-GGUF https://docs.unsloth.ai/basics/deepseek-v3.1"
X Link 2025-08-22T19:50Z 43.3K followers, 60.3K engagements
"OpenAI gpt-oss with ultra long context is hereπ Introducing Unsloth Flex Attention which enables 61K context for gpt-oss bf16 training on a 80GB GPU. Unsloth achieves 8longer context 50% less VRAM & 1.5faster training vs. all implementations. https://docs.unsloth.ai/basics/long-context-gpt-oss-training https://docs.unsloth.ai/basics/long-context-gpt-oss-training"
X Link 2025-08-28T16:48Z 43.3K followers, 142.4K engagements
"Unsloth now has a Docker image π³ Train LLMs locally with no setup: just run the image and go. Includes every pre-made Unsloth notebook. Solves dependency or environment issues. Guide: https://docs.unsloth.ai/new/how-to-train-llms-with-unsloth-and-docker https://docs.unsloth.ai/new/how-to-train-llms-with-unsloth-and-docker"
X Link 2025-10-01T13:42Z 43.2K followers, 97.1K engagements
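As a rough illustration of the no-setup workflow, the snippet below launches a container with GPU access from Python; the image name and port are assumptions, and the guide has the exact invocation.

```python
# Sketch: starting the Unsloth Docker image with GPU access. The image
# name and Jupyter port are assumptions for illustration; the linked guide
# has the exact command.
import subprocess

subprocess.run([
    "docker", "run",
    "--gpus", "all",        # expose NVIDIA GPUs to the container
    "-it",                  # interactive terminal
    "-p", "8888:8888",      # assumed Jupyter port mapping
    "unsloth/unsloth",      # assumed image name; check the guide
])
```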
"You can now fine-tune LLMs and deploy them directly on your phone π We collabed with PyTorch so you can export and run your trained model 100% locally on your iOS or Android device. Deploy Qwen3 on Pixel [--] and iPhone [--] Pro at [--] tokens/sec. Guide: https://docs.unsloth.ai/new/deploy-llms-phone https://docs.unsloth.ai/new/deploy-llms-phone https://docs.unsloth.ai/new/deploy-llms-phone https://docs.unsloth.ai/new/deploy-llms-phone"
X Link 2025-12-17T14:55Z 41.6K followers, 136.6K engagements
"NVIDIA made a beginner's guide to fine-tuning LLMs with Unsloth π You'll learn about: - Training methods: LoRA FFT RL - When to fine-tune and why + use-cases - Amount of data and VRAM needed - How to train locally on DGX Spark RTX GPUs & more Guide: https://blogs.nvidia.com/blog/rtx-ai-garage-fine-tuning-unsloth-dgx-spark/ https://blogs.nvidia.com/blog/rtx-ai-garage-fine-tuning-unsloth-dgx-spark/"
X Link 2025-12-22T13:42Z 42.3K followers, 140K engagements
"You can now fine-tune LLMs with Unsloth then deploy them in @LMStudio π¦₯πΎ We made a free notebook to fine-tune FunctionGemma (270M) so it thinks before calling tools then export the model to GGUF for deployment in LM Studio. Notebook: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/FunctionGemma_(270M)-LMStudio.ipynb We worked with @UnslothAI on a new beginner's guide: How to fine-tune FunctionGemma and run it locally π§ Train FunctionGemma for custom tool calls β¨ Convert it to GGUF + import into LM Studio πΎ Serve it locally and use it in your code Step-by-step"
X Link 2025-12-23T15:51Z 41.7K followers, 60.7K engagements
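The notebook's core loop is load, fine-tune, export. Here is a minimal sketch of that path, assuming Unsloth's documented FastLanguageModel loader and save_pretrained_gguf exporter; the repo id and quantization method are illustrative.

```python
# Minimal sketch of the fine-tune -> GGUF path, assuming Unsloth's
# documented FastLanguageModel / save_pretrained_gguf APIs. Repo id and
# quantization method are illustrative choices.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/functiongemma-270m-it",  # assumed repo id
    max_seq_length=2048,
    load_in_4bit=True,
)
# ... fine-tune on your tool-calling dataset here (see the notebook) ...

# Export to GGUF so LM Studio can load the result directly.
model.save_pretrained_gguf("functiongemma-tools", tokenizer,
                           quantization_method="q8_0")
```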
"Qwen releases Qwen-Image-2512 a new SOTA text-to-image model. π It's the top performing open diffusion model on AI Arena and has more realistic + accurate images/text. Run locally with 14GB RAM via our Dynamic GGUF Guide: GGUF: https://huggingface.co/unsloth/Qwen-Image-2512-GGUF https://unsloth.ai/docs/models/qwen-image-2512 πANewYeargiftfromQwenQwen-Image-2512ishere. πOurDecemberupgradetoQwen-ImagejustintimefortheNewYear. β¨Whatsnew: MorerealistichumansdramaticallyreducedAIlookricherfacialdetails Finernaturaltexturessharperlandscapeswater https://t.co/8X6AVcJCIG"
X Link 2025-12-31T09:34Z 43.2K followers, 120.2K engagements
"You can now do reinforcement learning training with [--] longer context and no accuracy loss via our new batching algorithms. Long reasoning chains in RL are costly but now we enable you to train gpt-oss with GRPO & reach 380K context on a 192GB GPU. https://unsloth.ai/docs/new/grpo-long-context https://unsloth.ai/docs/new/grpo-long-context https://unsloth.ai/docs/new/grpo-long-context https://unsloth.ai/docs/new/grpo-long-context"
X Link 2026-01-15T15:47Z 41.6K followers, 71.9K engagements
"You can now fine-tune embedding models in our free notebook Improve retrieval and RAG with better semantic search & similarity. Unsloth trains 2x faster 20% less VRAM 2x context & no accuracy loss Blog: EmbeddingGemma (300M): https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/EmbeddingGemma_(300M).ipynb https://unsloth.ai/docs/new/embedding-finetuning https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/EmbeddingGemma_(300M).ipynb https://unsloth.ai/docs/new/embedding-finetuning"
X Link 2026-01-22T16:08Z 43.5K followers, 81.5K engagements
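For orientation, here is what a minimal embedding fine-tune looks like with Sentence Transformers, the stack the notebook builds on; the model id and toy dataset are stand-ins, and the notebook has the Unsloth-accelerated version.

```python
# Minimal embedding fine-tune sketch with Sentence Transformers (the stack
# the Unsloth notebook accelerates). Model id and data are stand-ins.
from datasets import Dataset
from sentence_transformers import (SentenceTransformer,
                                   SentenceTransformerTrainer, losses)

model = SentenceTransformer("google/embeddinggemma-300m")  # assumed repo id
train = Dataset.from_dict({
    "anchor":   ["What is QLoRA?", "How big is GLM-5?"],
    "positive": ["QLoRA fine-tunes a 4-bit quantized model with LoRA adapters.",
                 "GLM-5 has 744B total parameters with 40B active."],
})
# Contrastive loss: pull (anchor, positive) pairs together, push others apart.
loss = losses.MultipleNegativesRankingLoss(model)

SentenceTransformerTrainer(model=model, train_dataset=train, loss=loss).train()
```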
"Unsloth is excited to support @HuggingFace Transformers v5 π€π¦₯ Get all the latest performance improvements in inference training and more Transformers v5's FINAL stable release is out π₯ Transformers' biggest release. The big Ws of this release: - Performance especially for MoE (6x-11x speedups) - No more slow/fast tokenizers - way simpler API explicit backends better performance - dynamic weight loading: way https://t.co/PV9lmE3KJx Transformers v5's FINAL stable release is out π₯ Transformers' biggest release. The big Ws of this release: - Performance especially for MoE (6x-11x speedups) -"
X Link 2026-01-26T23:50Z 43.5K followers, 21.6K engagements
"DeepSeek releases DeepSeek-OCR [--]. π The new 3B model achieves SOTA visual document and OCR understanding. DeepEncoder V2 is introduced which enables the model scan images in same logical order as humans boosting OCR accuracy. Instead of traditional vision LLMs which read an image in a fixed grid (top-left bottom-right) DeepEncoder V2 first builds a global understanding then learns a human-like reading order - what to attend to first next and so on. This improves OCR on complex layouts helping it follow columns link labels to values read tables coherently and handle mixed text + structure"
X Link 2026-01-27T06:09Z 43.5K followers, 222.8K engagements
"@Kimi_Moonshot Congrats guys & thank you for this amazing open release π We're working on Dynamic GGUFs so you guys can run it locally: https://huggingface.co/unsloth/Kimi-K2.5-GGUF https://huggingface.co/unsloth/Kimi-K2.5-GGUF"
X Link 2026-01-27T07:20Z 43.3K followers, 101.9K engagements
"For tutorials on how to Run & Fine-tune DeepSeek-OCR [--] you can read our guide: Inference & training for the model is already supported in Unsloth. https://unsloth.ai/docs/models/deepseek-ocr-2 https://unsloth.ai/docs/models/deepseek-ocr-2"
X Link 2026-01-27T09:13Z 43.5K followers, [----] engagements
"Note that VRAM is not required. You can run on a Mac with 256GB unified memory with similar speeds or [---] RAM without VRAM. You can even run with much less compute (e.g. 80GB RAM) as it'll offload but it'll be slower. https://twitter.com/i/web/status/2016532064955191619 https://twitter.com/i/web/status/2016532064955191619"
X Link 2026-01-28T15:21Z 43.5K followers, 16.5K engagements
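The offloading knob the post refers to is llama.cpp's GPU-layer count; a quick illustration below, with the filename and values as assumptions.

```python
# Sketch: controlling CPU/GPU offload in llama.cpp. `-ngl 0` keeps every
# layer in system RAM (no VRAM needed); raise it to offload layers to the
# GPU as VRAM allows. Filename and values are illustrative.
import subprocess

subprocess.run([
    "llama-cli",
    "-m", "Kimi-K2.5-UD-Q2_K_XL.gguf",  # hypothetical local filename
    "-ngl", "0",                         # 0 = CPU/RAM only; slower but works
    "-p", "Hello",
])
```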
"@Alibaba_Qwen Thank you so much for releasing an open-source LLM for fast and smart coding π₯° We made GGUFs so you can run Qwen3-Coder-Next locally on 46GB RAM or less: https://huggingface.co/unsloth/Qwen3-Coder-Next-GGUF https://huggingface.co/unsloth/Qwen3-Coder-Next-GGUF"
X Link 2026-02-03T16:21Z 43.4K followers, 23.4K engagements
"@Alibaba_Qwen We're super excited for more Qwen models this year ππ₯° Let's go open-source"
X Link 2026-02-03T17:16Z 43.3K followers, [----] engagements
"@NVIDIAAIDev @huggingface Congrats guys thank you Nvidia team for releasing brilliant open-source models ππ"
X Link 2026-02-04T05:44Z 43.3K followers, [----] engagements
"We made a guide on how to do tool calling with local LLMs. Learn how to use open models like Qwen3-Coder-Next and GLM-4.7-Flash for function calling. Has hands-on examples for: story writing Python execution terminal tool calls maths and more. Guide: https://unsloth.ai/docs/basics/tool-calling-guide-for-local-llms https://unsloth.ai/docs/basics/tool-calling-guide-for-local-llms https://unsloth.ai/docs/basics/tool-calling-guide-for-local-llms https://unsloth.ai/docs/basics/tool-calling-guide-for-local-llms"
X Link 2026-02-05T15:57Z 42.2K followers, [---] engagements
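As a taste of what the guide covers, here is one hedged tool-calling round trip against a local OpenAI-compatible endpoint (for example llama-server); the port, model name and the `get_time` tool are assumptions for illustration.

```python
# Illustrative tool-calling round trip against a local OpenAI-compatible
# server (e.g. llama-server on port 8080). Endpoint, model name and the
# `get_time` tool are assumptions; the guide shows complete workflows.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="local")

tools = [{
    "type": "function",
    "function": {
        "name": "get_time",
        "description": "Return the current UTC time.",
        "parameters": {"type": "object", "properties": {}},
    },
}]

resp = client.chat.completions.create(
    model="local-model",  # whatever the server is serving
    messages=[{"role": "user", "content": "What time is it?"}],
    tools=tools,
)
msg = resp.choices[0].message
if msg.tool_calls:  # the model may also answer directly
    call = msg.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
```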
"@Zai_org Congrats guys GLM-4.7-Flash is actually one of the most popular models we've ever seen π₯π https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF"
X Link 2026-02-10T13:09Z 43.3K followers, [----] engagements
"GLM-4.7-Flash GGUFs now produce significantly better outputs after recent llama.cpp bug fixes. We reconverted and updated the GGUFs. Run 4-bit locally on 18GB RAM. To get fixes re-download & use inference parameters by @Zai_org. Updated GGUFs: https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF You can now run GLM-4.7-Flash locally on your deviceπ₯ GLM-4.7-Flash is the best performing 30B model on SWE-Bench and GPQA. With 200K context it excels at coding agents chat & reasoning. Run local with 24GB RAM. Guide: https://t.co/SpJxl00VIa GGUF: https://t.co/aTuUxu32z3 https://t.co/3MwNRe3iva"
X Link 2026-01-21T13:28Z 43.5K followers, 153K engagements
"You can now train LLMs [--] faster with no accuracy loss via our new RoPE and MLP kernels. Our Triton kernels plus smart auto packing delivers [--] faster training & 30% less VRAM vs optimized FA3 setups. Train Qwen3-4B 3x faster on just 3.9GB VRAM. Blog: https://docs.unsloth.ai/new/3x-faster-training-packing https://docs.unsloth.ai/new/3x-faster-training-packing"
X Link 2025-12-10T14:41Z 43.5K followers, 627.7K engagements
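To ground the packing claim: packing concatenates short samples into full-length sequences so no compute is wasted on padding. Below is a minimal sketch of an Unsloth fine-tune with packing enabled, assuming the documented FastLanguageModel / TRL SFTTrainer flow; the model id and toy dataset are stand-ins.

```python
# Minimal Unsloth fine-tune sketch with sequence packing, assuming the
# documented FastLanguageModel + TRL SFTTrainer flow. Model id and the toy
# dataset are stand-ins for illustration.
from unsloth import FastLanguageModel  # import unsloth first, per its docs
from datasets import Dataset
from trl import SFTConfig, SFTTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-4B",  # assumed repo id
    max_seq_length=4096,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(model, r=16)  # attach LoRA adapters

train = Dataset.from_dict({"text": [
    "### Question: What is a sloth?\n### Answer: A slow-moving arboreal mammal.",
    "### Question: What is LoRA?\n### Answer: A low-rank adapter for fine-tuning.",
]})

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train,
    args=SFTConfig(
        packing=True,      # concatenate samples to fill each sequence
        max_steps=10,
        output_dir="outputs",
    ),
)
trainer.train()
```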
"You can now run GLM-4.7-Flash locally on your deviceπ₯ GLM-4.7-Flash is the best performing 30B model on SWE-Bench and GPQA. With 200K context it excels at coding agents chat & reasoning. Run local with 24GB RAM. Guide: GGUF: https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF https://unsloth.ai/docs/models/glm-4.7-flash Introducing GLM-4.7-Flash: Your local coding and agentic assistant. Setting a new standard for the 30B class GLM-4.7-Flash balances high performance with efficiency making it the perfect lightweight deployment option. Beyond coding it is also recommended for creative writing"
X Link 2026-01-20T05:22Z 43.5K followers, 335.5K engagements
"You can now run Kimi K2.5 locally π₯ We shrank the 1T model to 240GB (-60%) via Dynamic 1-bit. Run at [--] tok/s on 240GB VRAM/RAM. 2-bit is recommended as it passes our code tests. Run near full precision on 622GB. Guide: GGUF: https://huggingface.co/unsloth/Kimi-K2.5-GGUF https://unsloth.ai/docs/models/kimi-k2.5 π₯ Meet Kimi K2.5 Open-Source Visual Agentic Intelligence. πΉ Global SOTA on Agentic Benchmarks: HLE full set (50.2%) BrowseComp (74.9%) πΉ Open-source SOTA on Vision and Coding: MMMU Pro (78.5%) VideoMMMU (86.6%) SWE-bench Verified (76.8%) πΉ Code with Taste: turn chats"
X Link 2026-01-28T13:59Z 43.5K followers, 464.4K engagements
"We successfully trained an LLM without human intervention using Claude Code. We made a guide on how to do this with local LLMs via Claude Code and OpenAI Codex. Connect GLM-4.7-Flash to your server and start agentic coding locally Guide: https://unsloth.ai/docs/basics/claude-codex https://unsloth.ai/docs/basics/claude-codex"
X Link 2026-01-29T15:50Z 43.5K followers, 138.1K engagements
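The local end of that setup is just an OpenAI-compatible server in front of the GGUF. An illustrative launch below; the filename, port and layer count are assumptions, and the guide shows how Claude Code or Codex is then pointed at it.

```python
# Sketch: serving GLM-4.7-Flash locally over an OpenAI-compatible API so
# agentic coding tools can call it. Filename, port and offload count are
# illustrative; the guide covers the Claude Code / Codex wiring.
import subprocess

subprocess.Popen([
    "llama-server",
    "-m", "GLM-4.7-Flash-Q4_K_M.gguf",  # hypothetical local filename
    "--port", "8080",                    # clients use http://localhost:8080/v1
    "-ngl", "99",                        # offload as many layers as VRAM allows
])
```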
"Qwen releases Qwen3-Coder-Next. π The new 80B MoE model excels at agentic coding & local use. With 256K context it delivers similar performance to models with 10-20 more active parameters. Run on 46GB RAM or less. Guide: GGUF: https://huggingface.co/unsloth/Qwen3-Coder-Next-GGUF https://unsloth.ai/docs/models/qwen3-coder-next π IntroducingQwen3-Coder-Next an open-weight LM built for coding agents & local development. Whats new: π€ Scaling agentic training:800K verifiable tasks + executable envs π EfficiencyPerformance Tradeoff: achieves strong results on SWE-Bench Pro with 80B total params"
X Link 2026-02-03T16:11Z 43.5K followers, 239.1K engagements
"We created a tool-calling guide for local LLMs Learn how to use any open model like Qwen3-Coder-Next and GLM-4.7-Flash for function calling. We provide hands-on examples for: story writing Python execution terminal tool calls maths and more. Guide: https://unsloth.ai/docs/basics/tool-calling-guide-for-local-llms https://unsloth.ai/docs/basics/tool-calling-guide-for-local-llms"
X Link 2026-02-05T16:04Z 43.5K followers, 46.7K engagements
"You can now train MoE models [--] faster with 35% less VRAM via our new Triton kernels (no accuracy loss). Train gpt-oss locally on 12.8GB VRAM. In collab with @HuggingFace Unsloth trains DeepSeek Qwen3 GLM faster. Repo: Blog: https://unsloth.ai/docs/new/faster-moe https://github.com/unslothai/unsloth https://unsloth.ai/docs/new/faster-moe https://github.com/unslothai/unsloth"
X Link 2026-02-10T15:25Z 43.5K followers, 208.9K engagements
"You can now run MiniMax-2.5 locally π At 230B parameters MiniMax-2.5 is the strongest LLM under 700B params delivering SOTA agentic coding & chat. Run Dynamic 3/4-bit on a 128GB Mac for [--] tokens/s. Guide: GGUF: https://huggingface.co/unsloth/MiniMax-M2.5-GGUF https://unsloth.ai/docs/models/minimax-2.5 Introducing M2.5 an open-source frontier model designed for real-world productivity. - SOTA performance at coding (SWE-Bench Verified 80.2%) search (BrowseComp 76.3%) agentic tool-calling (BFCL 76.8%) & office work. - Optimized for efficient execution 37% faster at complex"
X Link 2026-02-15T13:41Z 43.5K followers, 125.3K engagements
"You can now train LLMs [--] faster with no accuracy loss via our new RoPE and MLP kernels. Our Triton kernels plus smart auto packing delivers [--] faster training & 30% less VRAM vs optimized FA3 setups. Train Qwen3-4B 3x faster on just 3.9GB VRAM. Blog: https://docs.unsloth.ai/new/3x-faster-training-packing https://docs.unsloth.ai/new/3x-faster-training-packing"
X Link 2025-12-10T14:41Z 43.5K followers, 627.7K engagements
"You can now run MiniMax-2.5 locally π At 230B parameters MiniMax-2.5 is the strongest LLM under 700B params delivering SOTA agentic coding & chat. Run Dynamic 3/4-bit on a 128GB Mac for [--] tokens/s. Guide: GGUF: https://huggingface.co/unsloth/MiniMax-M2.5-GGUF https://unsloth.ai/docs/models/minimax-2.5 Introducing M2.5 an open-source frontier model designed for real-world productivity. - SOTA performance at coding (SWE-Bench Verified 80.2%) search (BrowseComp 76.3%) agentic tool-calling (BFCL 76.8%) & office work. - Optimized for efficient execution 37% faster at complex"
X Link 2026-02-15T13:41Z 43.5K followers, 125.3K engagements
"Introducing M2.5 an open-source frontier model designed for real-world productivity. - SOTA performance at coding (SWE-Bench Verified 80.2%) search (BrowseComp 76.3%) agentic tool-calling (BFCL 76.8%) & office work. - Optimized for efficient execution 37% faster at complex tasks. - At $1 per hour with [---] tps infinite scaling of long-horizon agents now economically possible MiniMax Agent: API: CodingPlan: http://platform.minimax.io/subscribe/coding-plan http://platform.minimax.io http://agent.minimax.io http://platform.minimax.io/subscribe/coding-plan http://platform.minimax.io"
X Link 2026-02-12T16:12Z 61.5K followers, 5.1M engagements
"You can now run GLM-5 locallyπ₯ GLM-5 is a new open SOTA agentic coding & chat LLM with 200K context. We shrank the 744B model from 1.65TB to 241GB (-85%) via Dynamic 2-bit. Runs on a 256GB Mac or RAM/VRAM setups. Guide: GGUF: https://huggingface.co/unsloth/GLM-5-GGUF https://unsloth.ai/docs/models/glm-5 Introducing GLM-5: From Vibe Coding to Agentic Engineering GLM-5 is built for complex systems engineering and long-horizon agentic tasks. Compared to GLM-4.5 it scales from 355B params (32B active) to 744B (40B active) with pre-training data growing from 23T to 28.5T tokens."
X Link 2026-02-12T12:55Z 43.5K followers, 225.2K engagements
"Introducing GLM-5: From Vibe Coding to Agentic Engineering GLM-5 is built for complex systems engineering and long-horizon agentic tasks. Compared to GLM-4.5 it scales from 355B params (32B active) to 744B (40B active) with pre-training data growing from 23T to 28.5T tokens. Try it now: Weights: Tech Blog: OpenRouter (Previously Pony Alpha): Rolling out from Coding Plan Max users: http://z.ai/subscribe http://openrouter.ai/z-ai/glm-5 http://z.ai/blog/glm-5 http://huggingface.co/zai-org/GLM-5 http://chat.z.ai http://z.ai/subscribe http://openrouter.ai/z-ai/glm-5 http://z.ai/blog/glm-5"
X Link 2026-02-11T17:33Z 50.9K followers, 1.4M engagements
"tuning open weight models on colab is hands down biggest educational unlock out there. It sets students hackers devs anyone off on a rabbit hole journey of tinkering with their own models. for me what @UnslothAI have done with moe training is both a technical and educational wonder. that we just need to soak in. you can take @OpenAIDevs gpt-oss-20b and fine tune it for free in a few hours. https://twitter.com/i/web/status/2021610578138054773 https://twitter.com/i/web/status/2021610578138054773"
X Link 2026-02-11T15:41Z [----] followers, [----] engagements
"RT @NVIDIAAIDev: This is an incredible performance breakthrough from @UnslothAI. 12x faster fine-tuning 35% less VRAM all with no loss i"
X Link 2026-02-11T01:45Z 43.5K followers, [---] engagements
"This is an incredible performance breakthrough from @UnslothAI. 12x faster fine-tuning 35% less VRAM all with no loss in accuracy enables fine-tuning of MoE models like gpt-oss-20b on just [--] GB of VRAM. You can now train MoE models [--] faster with 35% less VRAM via our new Triton kernels (no accuracy loss). Train gpt-oss locally on 12.8GB VRAM. In collab with @HuggingFace Unsloth trains DeepSeek Qwen3 GLM faster. Repo: https://t.co/aZWYAtakBP Blog: https://t.co/3wRiBxVJB6 https://t.co/MZke9gtISU You can now train MoE models [--] faster with 35% less VRAM via our new Triton kernels (no accuracy"
X Link 2026-02-11T01:37Z 91.8K followers, 127.7K engagements
"You can now train MoE models [--] faster with 35% less VRAM via our new Triton kernels (no accuracy loss). Train gpt-oss locally on 12.8GB VRAM. In collab with @HuggingFace Unsloth trains DeepSeek Qwen3 GLM faster. Repo: Blog: https://unsloth.ai/docs/new/faster-moe https://github.com/unslothai/unsloth https://unsloth.ai/docs/new/faster-moe https://github.com/unslothai/unsloth"
X Link 2026-02-10T15:25Z 43.5K followers, 208.9K engagements
"GLM-4.7-Flash-GGUF is now the most downloaded model on @UnslothAI"
X Link 2026-02-10T12:59Z 50.9K followers, 56.4K engagements
"We created a tool-calling guide for local LLMs Learn how to use any open model like Qwen3-Coder-Next and GLM-4.7-Flash for function calling. We provide hands-on examples for: story writing Python execution terminal tool calls maths and more. Guide: https://unsloth.ai/docs/basics/tool-calling-guide-for-local-llms https://unsloth.ai/docs/basics/tool-calling-guide-for-local-llms"
X Link 2026-02-05T16:04Z 43.5K followers, 46.7K engagements
"RT @Alibaba_Qwen: πππππ Thanks for the support from day 0"
X Link 2026-02-03T17:00Z 43.5K followers, [--] engagements
"πππππ Thanks for the support from day [--] Qwen releases Qwen3-Coder-Next. π The new 80B MoE model excels at agentic coding & local use. With 256K context it delivers similar performance to models with 10-20 more active parameters. Run on 46GB RAM or less. Guide: https://t.co/kFrY9qi5co GGUF: https://t.co/J6Eb8c1nKO https://t.co/nBeplo3cdG Qwen releases Qwen3-Coder-Next. π The new 80B MoE model excels at agentic coding & local use. With 256K context it delivers similar performance to models with 10-20 more active parameters. Run on 46GB RAM or less. Guide: https://t.co/kFrY9qi5co GGUF:"
X Link 2026-02-03T16:58Z 142.4K followers, 30.8K engagements
"Qwen releases Qwen3-Coder-Next. π The new 80B MoE model excels at agentic coding & local use. With 256K context it delivers similar performance to models with 10-20 more active parameters. Run on 46GB RAM or less. Guide: GGUF: https://huggingface.co/unsloth/Qwen3-Coder-Next-GGUF https://unsloth.ai/docs/models/qwen3-coder-next π IntroducingQwen3-Coder-Next an open-weight LM built for coding agents & local development. Whats new: π€ Scaling agentic training:800K verifiable tasks + executable envs π EfficiencyPerformance Tradeoff: achieves strong results on SWE-Bench Pro with 80B total params"
X Link 2026-02-03T16:11Z 43.5K followers, 239.1K engagements
"π IntroducingQwen3-Coder-Next an open-weight LM built for coding agents & local development. Whats new: π€ Scaling agentic training:800K verifiable tasks + executable envs π EfficiencyPerformance Tradeoff: achieves strong results on SWE-Bench Pro with 80B total params and 3B active β¨SupportsOpenClaw Qwen Code Claude Code web dev browser use Cline etc π€ Hugging Face: π€ ModelScope: π Blog: π Tech https://github.com/QwenLM/Qwen3-Coder/blob/main/qwen3_coder_next_tech_report.pdf https://qwen.ai/blogid=qwen3-coder-next https://modelscope.cn/collections/Qwen/Qwen3-Coder-Next"
X Link 2026-02-03T16:09Z 142.4K followers, 1.5M engagements
"We successfully trained an LLM without human intervention using Claude Code. We made a guide on how to do this with local LLMs via Claude Code and OpenAI Codex. Connect GLM-4.7-Flash to your server and start agentic coding locally Guide: https://unsloth.ai/docs/basics/claude-codex https://unsloth.ai/docs/basics/claude-codex"
X Link 2026-01-29T15:50Z 43.5K followers, 138.1K engagements
"You can now run Kimi K2.5 locally π₯ We shrank the 1T model to 240GB (-60%) via Dynamic 1-bit. Run at [--] tok/s on 240GB VRAM/RAM. 2-bit is recommended as it passes our code tests. Run near full precision on 622GB. Guide: GGUF: https://huggingface.co/unsloth/Kimi-K2.5-GGUF https://unsloth.ai/docs/models/kimi-k2.5 π₯ Meet Kimi K2.5 Open-Source Visual Agentic Intelligence. πΉ Global SOTA on Agentic Benchmarks: HLE full set (50.2%) BrowseComp (74.9%) πΉ Open-source SOTA on Vision and Coding: MMMU Pro (78.5%) VideoMMMU (86.6%) SWE-bench Verified (76.8%) πΉ Code with Taste: turn chats"
X Link 2026-01-28T13:59Z 43.5K followers, 464.4K engagements
"π₯ Meet Kimi K2.5 Open-Source Visual Agentic Intelligence. πΉ Global SOTA on Agentic Benchmarks: HLE full set (50.2%) BrowseComp (74.9%) πΉ Open-source SOTA on Vision and Coding: MMMU Pro (78.5%) VideoMMMU (86.6%) SWE-bench Verified (76.8%) πΉ Code with Taste: turn chats images & videos into aesthetic websites with expressive motion. πΉ Agent Swarm (Beta): self-directed agents working in parallel at scale. Up to [---] sub-agents [----] tool calls [---] faster compared with single-agent setup. - π₯ K2.5 is now live on in chat mode and agent mode. π₯ K2.5 Agent Swarm in beta for high-tier users. π₯"
X Link 2026-01-27T05:42Z 112.1K followers, 7.2M engagements
"Note that VRAM is not required. You can run on a Mac with 256GB unified memory with similar speeds or [---] RAM without VRAM. You can even run with much less compute (e.g. 80GB RAM) as it'll offload but it'll be slower. https://twitter.com/i/web/status/2016532064955191619 https://twitter.com/i/web/status/2016532064955191619"
X Link 2026-01-28T15:21Z 43.5K followers, 16.5K engagements
"DeepSeek releases DeepSeek-OCR [--]. π The new 3B model achieves SOTA visual document and OCR understanding. DeepEncoder V2 is introduced which enables the model scan images in same logical order as humans boosting OCR accuracy. Instead of traditional vision LLMs which read an image in a fixed grid (top-left bottom-right) DeepEncoder V2 first builds a global understanding then learns a human-like reading order - what to attend to first next and so on. This improves OCR on complex layouts helping it follow columns link labels to values read tables coherently and handle mixed text + structure"
X Link 2026-01-27T06:09Z 43.5K followers, 222.8K engagements
"For tutorials on how to Run & Fine-tune DeepSeek-OCR [--] you can read our guide: Inference & training for the model is already supported in Unsloth. https://unsloth.ai/docs/models/deepseek-ocr-2 https://unsloth.ai/docs/models/deepseek-ocr-2"
X Link 2026-01-27T09:13Z 43.5K followers, [----] engagements
"Unsloth is excited to support @HuggingFace Transformers v5 π€π¦₯ Get all the latest performance improvements in inference training and more Transformers v5's FINAL stable release is out π₯ Transformers' biggest release. The big Ws of this release: - Performance especially for MoE (6x-11x speedups) - No more slow/fast tokenizers - way simpler API explicit backends better performance - dynamic weight loading: way https://t.co/PV9lmE3KJx Transformers v5's FINAL stable release is out π₯ Transformers' biggest release. The big Ws of this release: - Performance especially for MoE (6x-11x speedups) -"
X Link 2026-01-26T23:50Z 43.5K followers, 21.6K engagements
"Transformers v5's FINAL stable release is out π₯ Transformers' biggest release. The big Ws of this release: - Performance especially for MoE (6x-11x speedups) - No more slow/fast tokenizers - way simpler API explicit backends better performance - dynamic weight loading: way faster and enabling: MoE now working w/ quants tp peft . We have a migration guide on the main branch; please take a look at it in case you run into issues. Come in our GH issues if you still do after reading it π https://twitter.com/i/web/status/2015802366730395764 https://twitter.com/i/web/status/2015802366730395764"
X Link 2026-01-26T15:01Z 10.7K followers, 74.4K engagements
"Sentence Transformers π€ @UnslothAI We've collaborated with the fine folks at @UnslothAI to make your embedding model finetuning 2x faster and require 20% less VRAM The Unsloth team prepared [--] notebooks showing how you can take advantage of it π§΅"
X Link 2026-01-22T17:34Z [----] followers, 15K engagements
"You can now fine-tune embedding models in our free notebook Improve retrieval and RAG with better semantic search & similarity. Unsloth trains 2x faster 20% less VRAM 2x context & no accuracy loss Blog: EmbeddingGemma (300M): https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/EmbeddingGemma_(300M).ipynb https://unsloth.ai/docs/new/embedding-finetuning https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/EmbeddingGemma_(300M).ipynb https://unsloth.ai/docs/new/embedding-finetuning"
X Link 2026-01-22T16:08Z 43.5K followers, 81.5K engagements