# ![@Xianbao_QIAN Avatar](https://lunarcrush.com/gi/w:26/cr:twitter::1597257798068637697.png) @Xianbao_QIAN Tiezhen WANG

Tiezhen WANG posts on X about llm, ai, demo, moe the most. They currently have [-----] followers and [---] posts still getting attention that total [------] engagements in the last [--] hours.

### Engagements: [------] [#](/creator/twitter::1597257798068637697/interactions)
![Engagements Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::1597257798068637697/c:line/m:interactions.svg)

- [--] Week [-------] +632%
- [--] Month [-------] +7.30%
- [--] Months [---------] +218%
- [--] Year [---------] +246%

### Mentions: [--] [#](/creator/twitter::1597257798068637697/posts_active)
![Mentions Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::1597257798068637697/c:line/m:posts_active.svg)

- [--] Week [--] -7.70%
- [--] Month [--] +48%
- [--] Months [---] +36%
- [--] Year [---] +84%

### Followers: [-----] [#](/creator/twitter::1597257798068637697/followers)
![Followers Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::1597257798068637697/c:line/m:followers.svg)

- [--] Week [-----] +2.90%
- [--] Month [-----] +5%
- [--] Months [-----] +47%
- [--] Year [-----] +118%

### CreatorRank: [-------] [#](/creator/twitter::1597257798068637697/influencer_rank)
![CreatorRank Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::1597257798068637697/c:line/m:influencer_rank.svg)

### Social Influence

**Social category influence**
[technology brands](/list/technology-brands)  [finance](/list/finance)  [stocks](/list/stocks)  [countries](/list/countries)  [social networks](/list/social-networks)  [celebrities](/list/celebrities)  [travel destinations](/list/travel-destinations)  [exchanges](/list/exchanges)  [automotive brands](/list/automotive-brands)  [cryptocurrencies](/list/cryptocurrencies) 

**Social topic influence**
[llm](/topic/llm), [ai](/topic/ai), [demo](/topic/demo), [moe](/topic/moe), [performance](/topic/performance), [release](/topic/release), [the new](/topic/the-new), [up to](/topic/up-to), [in the](/topic/in-the), [baidu](/topic/baidu)

**Top accounts mentioned or mentioned by**
[@huggingface](/creator/undefined) [@alibabaqwen](/creator/undefined) [@deepseekai](/creator/undefined) [@zaiorg](/creator/undefined) [@01aiyi](/creator/undefined) [@justinlin610](/creator/undefined) [@alibabagroup](/creator/undefined) [@tencenthunyuan](/creator/undefined) [@osanseviero](/creator/undefined) [@huybery](/creator/undefined) [@thukeg](/creator/undefined) [@txhunyuan](/creator/undefined) [@kaifulee](/creator/undefined) [@parrynee](/creator/undefined) [@richardllin](/creator/undefined) [@senseyewinning](/creator/undefined) [@chatglm](/creator/undefined) [@kimimoonshot](/creator/undefined) [@meituanlongcat](/creator/undefined) [@internlm](/creator/undefined)

**Top assets mentioned**
[Microsoft Corp. (MSFT)](/topic/microsoft) [Flux (FLUX)](/topic/flux)

### Top Social Posts
Top posts by engagements in the last [--] hours

"History in the making 📖 Join the first ever Chinese LLM novel writing competition. Let's explore the boundaries of large language models and see what stories they can spin. Bonus gift for using open source models. 📷 Details below. #AIWriting #LLM https://mp.weixin.qq.com/s/9sNOrolEC34OxAZXKFBP_Q"  
[X Link](https://x.com/Xianbao_QIAN/status/1690357299607719936)  2023-08-12T13:39Z [----] followers, [---] engagements


"Want to know what's involved in training the code model? Check this presentation from Loubna :-) Hint: it's just about burning GPUs"  
[X Link](https://x.com/Xianbao_QIAN/status/1714634363726332275)  2023-10-18T13:27Z [---] followers, [--] engagements


"Breaking: Beijing Court makes a groundbreaking ruling on AI-generated art 🎨 - dispute over an image created using Stable Diffusion - creation process: prompts parameters are considered as IP. - The human user who creatively engages with AI holds the copyright"  
[X Link](https://x.com/Xianbao_QIAN/status/1739998491190104127)  2023-12-27T13:15Z [---] followers, [---] engagements


"📣🎨 Breaking: Beijing court declares AI-generated content copyrightable. Landmark ruling sets a new precedent. #AI #CopyrightLaw #TechNews 🚀 Breaking: Beijing Court makes a groundbreaking ruling on AI-generated art 🎨 - dispute over an image created using Stable Diffusion - creation process: prompts parameters are considered as IP. - The human user who creatively engages with AI holds the copyright."  
[X Link](https://x.com/Xianbao_QIAN/status/1740000138054295785)  2023-12-27T13:22Z [---] followers, [--] engagements


"MooreThreads' OS re-production of Alibaba's amazing AnimateAnyone project is now live on HF: Demo: The queue could be long so you might want to duplicate this space to skip the queue"  
[X Link](https://x.com/Xianbao_QIAN/status/1746818117316345938)  2024-01-15T08:54Z [---] followers, [---] engagements


"The same prompt doesn't work for me. 🤣 This is me "after HF sell or IPO". I don't like oil head / suits though. me when we sell or IPO @huggingface https://t.co/PizLrQcW61"  
[X Link](https://x.com/Xianbao_QIAN/status/1750884391713464811)  2024-01-26T14:12Z [---] followers, [---] engagements


"Would the next generation (the really affordable one) Sora model be built on top of SSM? Vision-RWKV from the @RWKV_AI family is now on @huggingface. With reduced spatial aggregation complexity it can easily process high-resolution images"  
[X Link](https://x.com/Xianbao_QIAN/status/1764577476645470319)  2024-03-04T09:03Z [---] followers, [----] engagements


"Feel frustrated at slow uploading speed to @huggingface with VPN? Some potential solutions: - Use Colab as relay - ModelScope to HF - WiseModel to HF Comment below for feedback or other recommendations :-)"  
[X Link](https://x.com/Xianbao_QIAN/status/1771012272548188629)  2024-03-22T03:13Z [---] followers, [---] engagements


"Mini-Gemini Multimodal LLM from CUHK arrived on @huggingface - Available in different sizes - Fine-tuned on LLMs - Supports high-resolution images Model: https://huggingface.co/collections/YanweiLi/mini-gemini-6603c50b9b43d044171d0854 Dataset: https://huggingface.co/collections/YanweiLi/mini-gemini-data-660463ea895a01d8f367624e Demo: https://huggingface.co/spaces/wcy1122/Mini-Gemini"  
[X Link](https://x.com/Xianbao_QIAN/status/1779769121527529585)  2024-04-15T07:09Z [---] followers, [----] engagements


"Very interesting article. It seems that one can 4x the context length of an LLM nearly losslessly without any training. Would be nice if it can be applied to all RoPE based LLMs in Transformers. Training-Free Long-Context Scaling of Large Language Models https://arxiv.org/pdf/2402.17463.pdf"  
[X Link](https://x.com/Xianbao_QIAN/status/1782765560834425214)  2024-04-23T13:36Z [----] followers, [----] engagements


"@nhciao OK. I hide the private key which accesses Satoshi's BTC in One Piece of QR code hidden in a far away land on the east now go hunt it with your latest VisionPro"  
[X Link](https://x.com/Xianbao_QIAN/status/1783139565597049291)  2024-04-24T14:22Z [---] followers, [--] engagements


"@art_zucker I want a video version of it and how about naming it as Tik(Tok)Sys lol"  
[X Link](https://x.com/Xianbao_QIAN/status/1783144065091146164)  2024-04-24T14:40Z [---] followers, [--] engagements


"Piecewise Rectified Flow (PeRFlow) model/demo is now available on @huggingface - Fast high-quality image generation in just [--] steps - Work with other SD pipeline including ControlNet IP-Adapter etc. - Better consistency & diversity compared to LCM Links:"  
[X Link](https://x.com/Xianbao_QIAN/status/1783897349754122273)  2024-04-26T16:34Z [---] followers, [---] engagements


"New model from @nvidia Introducing ChatQA-1.5 a family of models that surpasses GPT-4-0613 and Command-R-Plus on RAG and conversational QA. ChatQA-1.5 has two variants: Llama3-ChatQA-1.5-8B https://t.co/H7JvIFCD48 Llama3-ChatQA-1.5-70B https://t.co/Ao3Yw8ECxA We also open source our instruction"  
[X Link](https://x.com/Xianbao_QIAN/status/1786660109361565952)  2024-05-04T07:32Z [----] followers, [---] engagements


"China and France signed a Joint Statement on AI and Global Governance. Takeaways: - promote the safe development of AI - deepen discussions on international AI governance - provide inclusive access for all respecting multilingualism and cultural diversity http://us.china-embassy.gov.cn/chn/zgyw/202405/t20240507_11293821.htm"  
[X Link](https://x.com/Xianbao_QIAN/status/1787819547044762067)  2024-05-07T12:19Z [---] followers, [---] engagements


"@katieelink @SAILhealth @nvidia Congratulations Katie"  
[X Link](https://x.com/Xianbao_QIAN/status/1788009809515790757)  2024-05-08T00:55Z [---] followers, [--] engagements


"@reach_vb Very nice but like many other nice Spaces: 😭"  
[X Link](https://x.com/Xianbao_QIAN/status/1788941805251432538)  2024-05-10T14:38Z [----] followers, [---] engagements


"@LucasAtkins7 @JustinLin610 @Alibaba_Qwen Great work Curious how long it took to fine-tune the model with just 8xH100 thx"  
[X Link](https://x.com/Xianbao_QIAN/status/1789865836133454035)  2024-05-13T03:50Z [----] followers, [---] engagements


"@LucasAtkins7 @JustinLin610 @Alibaba_Qwen Whoa that's much shorter than I initially thought. Taking Lambda Labs pricing for a calculation: $28 * [--] * [--] = $4704, less than $5000. Not bad https://lambdalabs.com/service/gpu-cloud#pricing"  
[X Link](https://x.com/Xianbao_QIAN/status/1789879356317147360)  2024-05-13T04:44Z [----] followers, [--] engagements
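
The estimate in the reply above is a rate-times-duration product. Below is a minimal sketch of that arithmetic, assuming the $28 figure is an hourly price for the whole 8xH100 node (the post does not say so explicitly). The function names are hypothetical, and the two elided factors in the post are deliberately not reconstructed; only the visible rate and total are used.

```python
def training_cost(hourly_rate_usd: float, hours: float) -> float:
    """Cost of renting one node at a flat hourly rate."""
    return hourly_rate_usd * hours


def implied_hours(total_usd: float, hourly_rate_usd: float) -> float:
    """Rental duration implied by a total bill and an hourly rate."""
    return total_usd / hourly_rate_usd


# Figures visible in the post: $28 rate (assumed hourly) and $4704 total.
hours = implied_hours(4704, 28)
print(hours)                     # 168.0
print(training_cost(28, hours))  # 4704.0
```

Under that assumption the quoted total corresponds to 168 node-hours, which happens to equal seven days of continuous use.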


"This T2I model release from Tencent is BIG. Key features - DiT (The Diffusion Transformer) architecture - Multi-turn dialog - Native English / Chinese understanding Model: https://huggingface.co/Tencent-Hunyuan/HunyuanDiT Project page: https://dit.hunyuan.tencent.com/ Technical report: https://tencent.github.io/HunyuanDiT/asset/Hunyuan_DiT_Tech_Report_05140553.pdf"  
[X Link](https://x.com/Xianbao_QIAN/status/1790287732906054109)  2024-05-14T07:47Z [----] followers, [----] engagements


"This is now the official demo for Hunyuan-DiT the first OS DiT model that understands both Chinese and English: https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT Spoiler alert: Watch the org and stay tuned for an update on the model. This T2I model release from Tencent is BIG Key features - DiT (The Diffusion Transformer) architecture - Multi-turn dialog - Native English / Chinese understanding Model: https://t.co/YFpJJubGGR Project page: https://t.co/PBWQNJMtXC Technical report: https://t.co/5J610iFZKE https://t.co/brk3M1iciJ"  
[X Link](https://x.com/Xianbao_QIAN/status/1791116335155708349)  2024-05-16T14:39Z [----] followers, [----] engagements


"Want to try @deepseek_ai 's DeepSeek V2 and play with the shiny MLA but do not have enough GPUs to run 236B? Here comes the lite version: - Runs on 40G GPU - 16B total 2.4B active params - 5.7T training tokens Base: https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite Chat: https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite-Chat"  
[X Link](https://x.com/Xianbao_QIAN/status/1791323576639140060)  2024-05-17T04:23Z [----] followers, [----] engagements


"Why the Apache [---] Matters for LLMs 🤔 @01AI_Yi recently switched from a permissive & commercially friendly license to Apache [---]. And the community loved it 🚀 @Alibaba_Qwen also had a poll on model license and the majority votes for Apache [---]. Why it is a Big Deal"  
[X Link](https://x.com/Xianbao_QIAN/status/1794017633425363386)  2024-05-24T14:48Z [----] followers, 10.5K engagements


"@01AI_Yi @Alibaba_Qwen 📚 Legal Simplicity: Custom licenses need costly & time-consuming legal review. Apache [---] is well-known & easier for legal teams to handle"  
[X Link](https://x.com/Xianbao_QIAN/status/1794017979715481817)  2024-05-24T14:49Z [----] followers, [---] engagements


"@01AI_Yi @Alibaba_Qwen 👩‍💻 Developer-Friendly: Legal docs are a pain for devs. Apache [---] is well-known and tech-friendly making it easier for non-native developers to understand the implications too"  
[X Link](https://x.com/Xianbao_QIAN/status/1794018104005341387)  2024-05-24T14:50Z [----] followers, [---] engagements


"@01AI_Yi @Alibaba_Qwen 🔗 Easier Integration: Apache [---] is compatible with many other licenses simplifying tasks like model merging with models of different licensing requirements"  
[X Link](https://x.com/Xianbao_QIAN/status/1794018270636577008)  2024-05-24T14:51Z [----] followers, [---] engagements


"@01AI_Yi @Alibaba_Qwen 🚫 No Permission Needed: Custom licenses often require explicit permission and additional documentation work of filling forms, creating barriers. Apache [---] removes this hurdle letting devs focus on innovation"  
[X Link](https://x.com/Xianbao_QIAN/status/1794018444859613434)  2024-05-24T14:51Z [----] followers, [---] engagements


"@01AI_Yi @Alibaba_Qwen There are a lot of interesting discussions from @JustinLin610 's poll: https://x.com/JustinLin610/status/1793559737482764375 which inspired this thread. Any other thoughts? Let me know. What kind of license do you prefer for our models? Why is our license problematic for you? Actually it is quite permissive"  
[X Link](https://x.com/Xianbao_QIAN/status/1794018826180587678)  2024-05-24T14:53Z [----] followers, [---] engagements


"There are two key components in Transformer architecture: the self-attention layer which captures relationships between tokens in context and the Feed-Forward Network (FFN) layer which stores knowledge. DeepSeek V2 introduces optimizations to both:"  
[X Link](https://x.com/Xianbao_QIAN/status/1794035627404755030)  2024-05-24T15:59Z [----] followers, [----] engagements


"Attention layer normally uses KV Cache to reduce repetitive compute but it consumes significant GPU RAM limiting concurrent requests. DeepSeek V2 introduces Multi-head Latent Attention (MLA) which stores only a small latent representation resulting in substantial RAM savings"  
[X Link](https://x.com/Xianbao_QIAN/status/1794036447173112001)  2024-05-24T16:03Z [----] followers, [----] engagements


"DeepSeek V2 utilizes [---] experts instead of the usual [--] as in Mixtral. This approach segments experts into finer granularity for higher specialization and more accurate knowledge acquisition. Activating only a small subset of experts for each token leads to efficient processing"  
[X Link](https://x.com/Xianbao_QIAN/status/1794037132623048815)  2024-05-24T16:05Z [----] followers, [----] engagements


"@huggingface TL;DR Pro / Enterprise users can simply enable Dev mode on their Space and then SSH into the Space with Dev Mode. Note that you need to commit and push files before restarting the Space otherwise content will be lost. Also changes need to be manually reloaded to take effect"  
[X Link](https://x.com/Xianbao_QIAN/status/1794724118623113540)  2024-05-26T13:35Z [----] followers, [---] engagements


"@huggingface Pros compared to Colab: - Persistent disk up to 1T with great HF hub integration for model checkpoints and dataset preparation. - Wide GPU selection including L4, A10G, A100, H100 and the multi-GPU versions. - Pay as you go, no monthly fee for the compute"  
[X Link](https://x.com/Xianbao_QIAN/status/1794725131954409565)  2024-05-26T13:39Z [----] followers, [---] engagements


"Actually I was wrong. Jupyter extension works great - just wait some time for the installation. (would be nice to pre-install it) This new workflow is super productive - try an idea in notebook - commit the code - share it on Space with the free Zero GPU ALL in the same window. 🚀 I love Colab but I'm moving away to @huggingface Space for the Dev Mode experience. Colab for LLM can be frustrating: disk mounts, backend executions and version control can be tricky. Plus no 80G H100 or multi-GPU options limit its capability. Check out this alternative https://t.co/e9N3X93zD6"  
[X Link](https://x.com/Xianbao_QIAN/status/1794736292221305245)  2024-05-26T14:24Z [----] followers, [---] engagements


"@stablequan @huggingface Good question. I don't think ZeroGPU works in VSCode but it should work on the web portal. But either way you only have very limited quota for zero GPU so it's undesirable to use zero GPU for developments"  
[X Link](https://x.com/Xianbao_QIAN/status/1794744017437270449)  2024-05-26T14:54Z [----] followers, [---] engagements


"Very cool video generated by the new version of Open-Sora plan: v1.1.0 So glad to follow along with the progress of the open video generation model. Check out the demo on @huggingface running on the free ZeroGPU: https://huggingface.co/spaces/LanguageBind/Open-Sora-Plan-v1.1.0 📣📣📣We are excited to announce the release of Open-Sora Plan v1.1.0. 🙌Thanks to ShareGPT4Video's capability to annotate long videos we can generate higher quality and longer videos. 🔥🔥🔥We continue to open-source all data, code and models https://t.co/C28gHbiPrU https://t.co/qzMvSiU9At"  
[X Link](https://x.com/Xianbao_QIAN/status/1795072357456830970)  2024-05-27T12:39Z [----] followers, [---] engagements


"@huggingface Huge thanks to aidiscovery2045 for the generated video above. Prompt: A cat is surfing"  
[X Link](https://x.com/Xianbao_QIAN/status/1795072644569555081)  2024-05-27T12:40Z [----] followers, [---] engagements


"@RemiCadene Ah This looks like such a fun game with friends controlling remote robotics arm to do things together"  
[X Link](https://x.com/Xianbao_QIAN/status/1795094674534744127)  2024-05-27T14:08Z [----] followers, [--] engagements


"@yshan2u Congratulations on the great work from Tencent"  
[X Link](https://x.com/Xianbao_QIAN/status/1796489591488454920)  2024-05-31T10:31Z [----] followers, [--] engagements


"@huggingface Skywork-MoE introduces two key training optimizations: Gating Logits normalization for better top-2 expert selection and adaptive Aux Loss for balanced expert distribution"  
[X Link](https://x.com/Xianbao_QIAN/status/1797581755358290091)  2024-06-03T10:51Z [----] followers, [---] engagements
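
The gating-logit normalization mentioned above can be illustrated with a small pure-Python sketch. It assumes the technique amounts to standardizing the router logits (zero mean, unit variance) and multiplying by a scale factor before the softmax. The function names and the `scale` parameter are hypothetical, and this is one plausible reading of the post, not the Skywork team's code.

```python
import math


def normalized_gate(logits, scale=1.0, eps=1e-6):
    """Standardize router logits, rescale them, then apply a softmax."""
    mean = sum(logits) / len(logits)
    var = sum((z - mean) ** 2 for z in logits) / len(logits)
    std = math.sqrt(var) + eps
    scaled = [scale * (z - mean) / std for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k=2):
    """Indices of the k experts with the highest gate probability."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]


# Nearly flat logits: without normalization the gate would be close to uniform.
logits = [0.10, 0.20, 0.15, 0.12]
soft = normalized_gate(logits, scale=1.0)
sharp = normalized_gate(logits, scale=4.0)

print(top_k(soft))                          # [1, 2]
print(round(max(soft), 3), round(max(sharp), 3))
```

A larger scale yields a more peaked distribution, which makes the top-2 choice more decisive even when the raw logits are close together.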


"The Chinese community has been making remarkable contributions not only by developing outstanding open-source models but also by generously sharing their expertise and insights through comprehensive technical reports. Below is a deep dive of the models mentioned by @osanseviero: The community keeps ignoring the Chinese ML ecosystem work. They are doing amazing stuff with interesting LLMs, VLMs, audio and diffusion models 👀 Qwen, Yi, DeepSeek, Yuan, WizardLM, ChatGLM, CogVLM, Baichuan, InternLM, OpenBMB, Skywork, ChatTTS, Ernie, HunyuanDiT etc."  
[X Link](https://x.com/Xianbao_QIAN/status/1797991281656008868)  2024-06-04T13:58Z [----] followers, [----] engagements


"@osanseviero Qwen from @AlibabaGroup @Alibaba_Qwen team. Top performing Open Source model from the Open LLM leaderboard. @JustinLin610 @huybery https://x.com/huybery/status/1754537742892232972 👋 Qwen's latest open source work Qwen1.5 says hello to the world 👉🏻 More sizes: six sizes for your different needs. 0.5B, 1.8B, 4B, 7B, 14B and 72B including Base and Chat. 👉🏻 Better alignment: despite still trailing behind GPT-4-Turbo the largest open-source https://t.co/u82vpRYDBm"  
[X Link](https://x.com/Xianbao_QIAN/status/1797992101772497155)  2024-06-04T14:01Z [----] followers, [---] engagements


"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery Yi/01 models from @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning They recently switched their entire model series to Apache [--] 🔥🔥🔥 And Yi-1.5-34B-Chat becomes the only OS Apache [--] permissive licensed model on lmsys arena that beats GPT4 https://x.com/parrynee/status/1797939955135881238 Despite its relatively modest size Yi-1.5-34B-Chat matches the performance of much larger models like Qwen1.5-110B and GPT-4 and even outperforms Mistral-Large. Even we are surprised. https://t.co/9TbyxABjT7"  
[X Link](https://x.com/Xianbao_QIAN/status/1797993843302031777)  2024-06-04T14:08Z [----] followers, [---] engagements


"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning DeepSeek V2 MoE from @deepseek_ai is a legendary model. With all their innovations on the model architecture they managed to force their competitors to drop the price to 1% of the original price. https://x.com/Xianbao_QIAN/status/1794034052347171055 DeepSeekV2 is a big deal. Not only because of its significant improvements to both key components of Transformer: the Attention layer and FFN layer. It has also completely disrupted the Chinese LLM market and forcing the"  
[X Link](https://x.com/Xianbao_QIAN/status/1797994459772469486)  2024-06-04T14:10Z [----] followers, [---] engagements


"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai Skywork recently dropped the first OS 100B+ MoE model using MoE upcycling. The upcycling vs training from scratch section in their technical report is definitely worth reading https://x.com/Xianbao_QIAN/status/1797581391351439473 Introducing Skywork-MoE from Kunlun Wanwei on @huggingface The first OS 100B+ MoE model using MoE upcycling - 22B active params matching the performance of 70B dense at 1/3 of the inference cost - Runs on 8x4090 GPUs with FP8"  
[X Link](https://x.com/Xianbao_QIAN/status/1797995425133437429)  2024-06-04T14:14Z [----] followers, [---] engagements


"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai Yuan is another MoE model that achieved comparable performance to LLAMA3 70B with only 3.7B active params (40B total params). https://x.com/Xianbao_QIAN/status/1796150983271276861 How sparse can a Mixture of Experts (MoE) model be in terms of active/total parameters ratio? 🤔 Introducing Yuan2.0-M32: 🌟 40B total params with only 3.7B active params 🚀 Performance comparable to LLAMA3 70B 💼 Commercial use allowed without authorization https://t.co/R8OVn0GqcS"  
[X Link](https://x.com/Xianbao_QIAN/status/1797995725705629723)  2024-06-04T14:16Z [----] followers, [---] engagements
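
The active/total ratio asked about above is a one-line calculation. This small sketch uses only parameter counts quoted in the posts on this page (Yuan2.0-M32 and DeepSeek-V2-Lite). The helper name and the dictionary are hypothetical conveniences.

```python
def active_ratio(active_params_b: float, total_params_b: float) -> float:
    """Fraction of a MoE model's parameters that are active per token."""
    return active_params_b / total_params_b


# (active, total) parameter counts in billions, as quoted in the posts.
models = {
    "Yuan2.0-M32": (3.7, 40.0),
    "DeepSeek-V2-Lite": (2.4, 16.0),
}

for name, (active, total) in models.items():
    print(f"{name}: {active_ratio(active, total):.2%} of parameters active")
# Yuan2.0-M32: 9.25% of parameters active
# DeepSeek-V2-Lite: 15.00% of parameters active
```

A lower ratio means less compute per token for a given total size, though it says nothing by itself about quality.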


"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai @WizardLM_AI team has been doing amazing work on fine-tuning coding and math model with RLHF. (Note that their models have been moved to https://huggingface.co/WizardLMTeam) Here is a very detailed and inspiring sharing on how they managed to achieve that. https://x.com/WizardLM_AI/status/1779937307690471834 🧙‍♀️We not only opensource the models but also share you how we reach that 🚀So now let's verify step-by-step to review the whole training method of"  
[X Link](https://x.com/Xianbao_QIAN/status/1797996660293034254)  2024-06-04T14:19Z [----] followers, [---] engagements


"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai @WizardLM_AI ChatGLM from @thukeg and @ChatGLM is arguably the very first Chinese chat LLM. AND (Spoiler alert) they're releasing a NEW model ChatGLM4 tmr. Stay tuned https://x.com/osanseviero/status/1636663692921131008 With the announcements this week many missed a big one: @thukeg released ChatGLM-6B🔥 - Open source - Chinese + English - Easily deployed on consumer GPUs - Trained for 1T token - Can be deployed on consumer GPUs (2080Ti) - Run with"  
[X Link](https://x.com/Xianbao_QIAN/status/1797997164247044303)  2024-06-04T14:21Z [----] followers, [---] engagements


"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai @WizardLM_AI @thukeg @ChatGLM CogVLM is yet another strong model from @thukeg @ChatGLM that focuses on visual understanding. It beats GPT4 V/ Gemini Pro on TextVQA, DocVQA and ChartQA - by a decent margin. https://x.com/reach_vb/status/1792551647039684993 Welcome CogVLM [--] ⚡ Beats GPT4 V/ Gemini Pro on TextVQA, DocVQA and ChartQA - by a decent margin 🔥 19B params Llama [--] 8B (Instruct) text backbone Supports 8K context length Up to [----] X [----] resolution"  
[X Link](https://x.com/Xianbao_QIAN/status/1797997857376751885)  2024-06-04T14:24Z [----] followers, [---] engagements


"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai @WizardLM_AI @thukeg @ChatGLM Baichuan is also a popular LLM that focuses on Chinese understanding and generation. https://x.com/AdeenaY8/status/1671180797037150208 baichuan-7B is trained on proprietary bilingual Chinese-English corpora optimized for Chinese and achieves SOTA performance on C-Eval and MMLU. https://t.co/hJdmB4Mpai"  
[X Link](https://x.com/Xianbao_QIAN/status/1797998346239619564)  2024-06-04T14:26Z [----] followers, [---] engagements


"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai @WizardLM_AI @thukeg @ChatGLM @intern_lm is the neighboring team to @OpenMMLab (a very popular open source library on Computer Vision). Both teams originated from Shanghai AI lab - a non-profit institute focusing on original research. Their recent InternLM math models are performing great. https://x.com/Xianbao_QIAN/status/1795079465086689318 InternLM2-Math is great. It's the first 7B model that can perfectly solve the [--] puzzle - a simplified version of Krypto game"  
[X Link](https://x.com/Xianbao_QIAN/status/1798003551005995213)  2024-06-04T14:47Z [----] followers, [---] engagements


"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai @WizardLM_AI @thukeg @ChatGLM @intern_lm @OpenMMLab You might be aware of @OpenBMB's MiniCPM-Llama3-V the actual "SOTA OS VLM". Beyond this the team created impactful datasets like UltraFeedback which has fueled many DPO models e.g. Zephyr, together with the coding agent framework ChatDev ahead of Devin https://x.com/OpenBMB/status/1797666243635487134 As a dedicated contributor to the open-source community OpenBMB feels deeply saddened and shocked by"  
[X Link](https://x.com/Xianbao_QIAN/status/1798010309799768435)  2024-06-04T15:13Z [----] followers, [---] engagements


"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai @WizardLM_AI @thukeg @ChatGLM @intern_lm @OpenMMLab Beyond just LLMs the Chinese community continues to impress with a diverse range of work. Here are some notable highlights:"  
[X Link](https://x.com/Xianbao_QIAN/status/1798013168377287149)  2024-06-04T15:25Z [----] followers, [---] engagements


"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai @WizardLM_AI @thukeg @ChatGLM @intern_lm @OpenMMLab ChatTTS: a mind blowing OS Text-To-Speech model for both English / Chinese. Check out this video and demo below: (The team needs more compute to train a larger model DM if you want to sponsor them) https://x.com/Xianbao_QIAN/status/1795490474461118804 ChatTTS: a powerful voice generation model designed for conversational scenarios - Trained on 100k hours of bilingual data - Excels in tasks like LLM assistant"  
[X Link](https://x.com/Xianbao_QIAN/status/1798014480783442251)  2024-06-04T15:30Z [----] followers, [---] engagements


"Chinese model duels with #sora 🚀 Here is what the amazing Kling team from Kuaishou achieved - High quality video up to 1080p [--] mins 30fps - 3D spatio-temporal attention to handle complex movements in videos - able to simulate real-world physics thanks to scaling laws link"  
[X Link](https://x.com/Xianbao_QIAN/status/1798867849987166656)  2024-06-07T00:01Z [----] followers, [----] engagements


"Supports 3D facial and body reconstruction. The video below is generated from a single full body picture"  
[X Link](https://x.com/Xianbao_QIAN/status/1798869110002180165)  2024-06-07T00:06Z [----] followers, [---] engagements


"Model can be found here as well as in Kuaishou app: https://kling.kuaishou.com/ Unfortunately it's not open sourced (yet)."  
[X Link](https://x.com/Xianbao_QIAN/status/1798869257293631568)  2024-06-07T00:07Z [----] followers, [---] engagements


"Next big release: Stable Diffusion [--] is coming in a few hours. What will this release bring to the OS community? Looking forward to it"  
[X Link](https://x.com/Xianbao_QIAN/status/1800871107995865109)  2024-06-12T12:41Z [----] followers, [----] engagements


"@nabla_theta "I don't need ChatGPT. I can write code myself.""  
[X Link](https://x.com/Xianbao_QIAN/status/1802584206909792438)  2024-06-17T06:08Z [----] followers, [---] engagements


"@HPCAITech @huggingface Support matrix: Green: the data has been utilized during the training phase of the model OK: although not trained the model can inference at that config. Also requires more than one 80G memory GPU and sequence parallelism"  
[X Link](https://x.com/Xianbao_QIAN/status/1802977157951701410)  2024-06-18T08:10Z [----] followers, [---] engagements


"How to breed dinosaurs? China is pioneering dinosaur breeding 🦖 aiming to lift people out of poverty and create wealth through this innovative industry 🚀. Exciting times ahead (The flying watermark proudly indicates that this was generated by AI just for fun, not for real)"  
[X Link](https://x.com/Xianbao_QIAN/status/1805604099431579700)  2024-06-25T14:08Z [----] followers, [--] engagements


"@Aspie96 Dinos is an easy example but I believe there will be other cases that cross boundaries and require watermarks (possibly invisible ones) to prevent misuse. It doesn't have to be watermark but has to be part of the video. Things that are not embedded can be easily discarded. wdyt"  
[X Link](https://x.com/Xianbao_QIAN/status/1805867450724237404)  2024-06-26T07:35Z [----] followers, [--] engagements


"Rumors suggest OpenAI has banned API usage initiated from unsupported regions. This basically gives away market share to its competitors. Also I find such blocks unnecessary and ineffective as proxies like https://github.com/xianbaoqian/llm-connector can easily bypass these restrictions. wdyt"  
[X Link](https://x.com/Xianbao_QIAN/status/1805887785649451329)  2024-06-26T08:56Z [----] followers, [---] engagements


"@huggingface Open LLM leaderboard V2: more challenging tests + mitigated data contamination, including TIGER-Lab's MMLU-Pro and many other new evals. Results: @Alibaba_Qwen Qwen2-72B-Instruct is the top [--] model and @01AI_Yi Yi-1.5-34B-Chat ranks 4th. https://huggingface.co/spaces/open-llm-leaderboard/blog"  
[X Link](https://x.com/Xianbao_QIAN/status/1805967985376731156)  2024-06-26T14:14Z [----] followers, [---] engagements


"🚀 Unlock the secrets of ZeroGPU on @huggingface Spaces 🌟 Building demos with high-performance A100 GPU inference for FREE on @huggingface via ZeroGPU, but feeling like you're navigating a maze with ZeroGPU's rules and implementation? Here are the secrets that you should know."  
[X Link](https://x.com/Xianbao_QIAN/status/1808851070149668933)  2024-07-04T13:11Z [----] followers, 12.9K engagements


"🚀 Kwai's Rise: Kolors is the top trending model and Live Portrait is the top trending Space. Congratulations to the team! Btw, over half of this trending page are contributions from the Chinese community, in case you're not aware: PersonaHub, FishSpeech, OpenVid-1M, ControlNet Union"  
[X Link](https://x.com/Xianbao_QIAN/status/1810724794654150667)  2024-07-09T17:16Z [----] followers, [---] engagements


"🎉 Exciting news from the @ChatGLM team: the new version of their VL model CogVLM2 can now understand videos ✨ By processing both frames and timestamp information, this model can do temporal localization and key movement detection. 👀 Could this be the game-changer for RAG on videos?"  
[X Link](https://x.com/Xianbao_QIAN/status/1811755565439504485)  2024-07-12T13:32Z [----] followers, [----] engagements


"Do you know that both SD3 and AuraFlow are not trained using a normal diffusion process? They're flow-based models, but what does that mean? 🤔 TL;DR: You can think of it as a generalization. The diffusion process is a special case of the flow. 👇 Read below for more information"  
[X Link](https://x.com/Xianbao_QIAN/status/1815560131347976495)  2024-07-23T01:30Z [----] followers, [----] engagements


"Instead of iteratively denoising at each step, flow-based models use linear interpolation. They learn a direct mapping from noise to data space. The math is well explained in the original paper from @XingchaoL: https://arxiv.org/pdf/2405.07510 https://arxiv.org/pdf/2209.03003"  
[X Link](https://x.com/Xianbao_QIAN/status/1815561250698621117)  2024-07-23T01:35Z [----] followers, [---] engagements
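
The linear interpolation mentioned in the post above can be sketched in a few lines. This is a toy illustration of the rectified-flow idea, not the training code of SD3 or AuraFlow: the function names are made up for this example, vectors are plain Python lists, and the "model" is replaced by an oracle velocity field that already knows the data point.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight line from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def velocity_target(x0, x1):
    """Constant velocity along the straight path: the regression target a model would learn."""
    return [b - a for a, b in zip(x0, x1)]

def euler_sample(velocity_fn, x0, steps=4):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

# Toy check: an oracle velocity field (hypothetical stand-in for a trained network).
noise = [random.gauss(0.0, 1.0) for _ in range(3)]
data = [1.0, -2.0, 0.5]
oracle = lambda x, t: velocity_target(noise, data)
sample = euler_sample(oracle, noise, steps=4)
```

With a perfectly straight path, even a handful of Euler steps lands on the data point, which is the intuition behind "a direct mapping from noise to data space". A real model only approximates the velocity field, so it typically needs more steps.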


"@ShunyuYao12 @OpenAI Congratulations"  
[X Link](https://x.com/Xianbao_QIAN/status/1818847555914256385)  2024-08-01T03:13Z [----] followers, [---] engagements


"@oceanheart_cai @Spaces Yeah [--] mins A100 GPU time"  
[X Link](https://x.com/Xianbao_QIAN/status/1820349649443844198)  2024-08-05T06:42Z [----] followers, [--] engagements


"My July slides, which cover: - The difference between the OS AI and traditional software ecosystems - An intro to Hugging Face offerings - My personal take on recent GenAI models - Some advice on how to run an OS community - Open questions to the OS AI community Feedback welcome"  
[X Link](https://x.com/Xianbao_QIAN/status/1820443797224120697)  2024-08-05T12:56Z [----] followers, [---] engagements


"Remember the ChatGPT 4o video? Open Source models can do it now as well :) Welcome @OpenBMB's MiniCPM-V-2.6, based on SigLip and Qwen-7B, on @huggingface. What's more incredible? It understands videos + runs well on mobile devices https://huggingface.co/openbmb/MiniCPM-V-2_6"  
[X Link](https://x.com/Xianbao_QIAN/status/1820762924144857317)  2024-08-06T10:04Z [----] followers, [----] engagements


"ChatArena for Chinese speaking models from @OpenCompassX, including LLMs and multimodal models. Want to try out Chinese models but don't have a Chinese phone number? This is the place to go. It covers as well as Claude and ChatGPT with a lot of OS models as well"  
[X Link](https://x.com/Xianbao_QIAN/status/1823029897625030814)  2024-08-12T16:12Z [----] followers, [----] engagements


"@huggingface The model tree can be found on the right side of the hub page, down below the inference API section. For example: https://huggingface.co/google/gemma-2-9b"  
[X Link](https://x.com/Xianbao_QIAN/status/1823347728589767020)  2024-08-13T13:15Z [----] followers, [---] engagements


"@huggingface Where did the magic come from? It's not complicated, just some metadata in the README.md file. If you made a fine-tune / quantization etc., just add base_model: PATH_TO_BASE_MODEL to your README.md file to get it indexed and boost discoverability"  
[X Link](https://x.com/Xianbao_QIAN/status/1823348391247745174)  2024-08-13T13:18Z [----] followers, [---] engagements
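
As a rough illustration of the metadata described above, the sketch below adds a `base_model` entry to the YAML front matter at the top of a README.md. The helper name `add_base_model` is hypothetical and the logic is deliberately simple string handling; it assumes the front matter, if present, is delimited by `---` lines, and it uses `google/gemma-2-9b` only as an example ID.

```python
def add_base_model(readme_text, base_model_id):
    """Return README text whose YAML front matter contains a base_model entry."""
    entry = f"base_model: {base_model_id}"
    lines = readme_text.splitlines()
    if lines and lines[0].strip() == "---":
        try:
            end = lines.index("---", 1)  # closing marker of the front matter
        except ValueError:
            end = None
        if end is not None:
            header = lines[1:end]
            # Only add the entry when the card does not declare one already.
            if not any(line.startswith("base_model:") for line in header):
                header.append(entry)
            return "\n".join(["---", *header, "---", *lines[end + 1:]])
    # No front matter found: create a minimal one above the existing text.
    return "\n".join(["---", entry, "---", *lines])

card = add_base_model("---\nlicense: mit\n---\n# My fine-tune", "google/gemma-2-9b")
```

For anything beyond a quick experiment, a proper YAML parser or the Hub's own tooling is a safer choice than string manipulation.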


"@TXhunyuan I only spotted two of them. But ChatGPT claimed that it has found [--]. Is this result correct? Btw could you share a bit more on how to generate two almost identical photos with hunyuan programmatically? That'll be a very interesting case to make"  
[X Link](https://x.com/Xianbao_QIAN/status/1824442297762857044)  2024-08-16T13:45Z [----] followers, [--] engagements


"@Alibaba_Qwen is on fire. Assuming size of ecosystem = original model + number of derived models, the @Alibaba_Qwen family has surpassed @MistralAI. Incredible exponential growth. Great work team! (Number of models calculated based on keyword matching in ID, scripts below)"  
[X Link](https://x.com/Xianbao_QIAN/status/1825897641567465913)  2024-08-20T14:08Z [----] followers, [---] engagements


"@Alibaba_Qwen @MistralAI Which other metrics of open-source models on @huggingface are you interested in"  
[X Link](https://x.com/Xianbao_QIAN/status/1825898407782977851)  2024-08-20T14:11Z [----] followers, [---] engagements


"What can AI do? Laser gun for mosquitos. Object detection + classification on flammable materials + robotics self-driving car (it seems). https://t.co/6qff4xWJDv"  
[X Link](https://x.com/Xianbao_QIAN/status/1830611658911441171)  2024-09-02T14:20Z [----] followers, [---] engagements


"@picocreator @Microsoft @Office @RWKV_AI What It's on Windows already"  
[X Link](https://x.com/Xianbao_QIAN/status/1831158828152631306)  2024-09-04T02:34Z [----] followers, [---] engagements


"@picocreator @Microsoft @Office @RWKV_AI That's great news @RWKV_AI"  
[X Link](https://x.com/Xianbao_QIAN/status/1831160525914575267)  2024-09-04T02:41Z [----] followers, [--] engagements


"@OxxoTweets @Microsoft @Office @RWKV_AI Looking forward to someone from MS jumping in"  
[X Link](https://x.com/Xianbao_QIAN/status/1831164294597386539)  2024-09-04T02:56Z [----] followers, [--] engagements


"Want to try FLUX LoRA from Shakker.ai? Try https://huggingface.co/spaces/Shakker-Labs/FLUX-LoRA-Gallery"  
[X Link](https://x.com/Xianbao_QIAN/status/1831697765975670879)  2024-09-05T14:15Z [----] followers, [---] engagements


"@maximelabonne @Alibaba_Qwen @deepseek_ai Very cool. Is it generated by a program from a csv file"  
[X Link](https://x.com/Xianbao_QIAN/status/1838183936331018672)  2024-09-23T11:49Z [----] followers, [---] engagements


"@huggingface Please be aware that the following is a very rough estimate, based heavily on the assumption that derived models include the model family name in their names, which could be false in a few cases. Non-derived models could also bear the family name"  
[X Link](https://x.com/Xianbao_QIAN/status/1838187200758042873)  2024-09-23T12:02Z [----] followers, [---] engagements


"@huggingface Surprisingly, qwen has surpassed llama and become the largest model family. Congratulations to the @Alibaba_Qwen team on the remarkable work. The model family for Gemma is also growing fast. Keep up the great work! You can verify it yourself at: https://huggingface.co/models?sort=trending&search=llama"  
[X Link](https://x.com/Xianbao_QIAN/status/1838188985556062388)  2024-09-23T12:09Z [----] followers, [----] engagements
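
The rough estimate described in this thread (counting a model toward a family when its ID contains the family name) could look something like the sketch below. It is not the author's script: the function name is invented, the counting runs over an in-memory list rather than the Hub, and the `someone/...` IDs are hypothetical placeholders.

```python
from collections import Counter

def count_by_family(model_ids, families):
    """Count model IDs per family using case-insensitive keyword matching."""
    counts = Counter()
    for model_id in model_ids:
        name = model_id.lower()
        for family in families:
            if family in name:
                counts[family] += 1
    return counts

sample_ids = [
    "Qwen/Qwen2-72B-Instruct",
    "someone/qwen2-7b-finetune",          # hypothetical derived model
    "meta-llama/Llama-3.3-70B-Instruct",
    "someone/llama-qwen-merge",           # hypothetical: matches two families
    "google/gemma-2-9b",
]
counts = count_by_family(sample_ids, ["qwen", "llama", "gemma"])
```

The merge example shows the caveat from the thread in action: a single model is counted for both families, and a derived model that drops the family name from its ID would be missed entirely.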


"@qubitium @art_zucker Good point. Yes Seems useful for background batch process such as synthetic data generation"  
[X Link](https://x.com/Xianbao_QIAN/status/1840757022515716118)  2024-09-30T14:14Z [----] followers, [--] engagements


"Had some holiday fun playing with Emu3 and spent a bit of time exploring the intricacies of Python's import system. Learned something and wrote a post. Can't wait to see how Emu3 evolves 🚀🖥 Anyone know why Emu3-gen is taking much longer than Emu3-chat? https://xianbao-qian.medium.com/predicting-the-next-multimodal-token-with-emu3-01b694d86eef"  
[X Link](https://x.com/Xianbao_QIAN/status/1841492919749947449)  2024-10-02T14:58Z [----] followers, [---] engagements


"Nice emotional TTS :) Homepage: https://swivid.github.io/F5-TTS/ Demo from the community: https://huggingface.co/spaces/mrfakename/E2-F5-TTS /TTSF5-TTS [--] 20.15TTS [--] 4/ github https://t.co/z7eSlYzgrp https://t.co/DMQ39CXt90 https://t.co/rqXTNMJ03H"  
[X Link](https://x.com/Xianbao_QIAN/status/1845836190504927317)  2024-10-14T14:36Z [----] followers, [---] engagements


"Welcome Janus, an autoregressive framework that unifies multimodal understanding and generation from @deepseek_ai - Super small in size only 1.8B - It uses a different visual encoder for image understanding / generation Model on @huggingface : https://huggingface.co/deepseek-ai/Janus-1.3B"  
[X Link](https://x.com/Xianbao_QIAN/status/1847187864335065403)  2024-10-18T08:08Z [----] followers, 12.4K engagements


"Microsoft is goat. Apart from BitNet for [--] bit quantization they also have VPTQ for 1-4 bit(s) quantization. Demo: https://huggingface.co/spaces/VPTQ-community/VPTQ-Demo The quality is quite impressive and I'm looking forward to more tools / libraries integration to make it run even faster. 🌟 Our Vector Post-Training Quantization (VPTQ): a cutting-edge method for Post-Training Quantization that uses Vector Quantization to achieve high accuracy on LLMs with less than [--] bits. https://t.co/5gKiRzKLLQ #LLM #Quantization #MachineLearning"  
[X Link](https://x.com/Xianbao_QIAN/status/1848348913184686177)  2024-10-21T13:01Z [----] followers, [----] engagements


"If better data = better model, then WeChat has a massive goldmine of Chinese language data, and here is their recent multimodal model that they shared with the world on @huggingface. Welcome WeChat AI's VL model series: POINTS https://huggingface.co/WePOINTS/POINTS-Qwen-2-5-7B-Chat https://github.com/WePOINTS/WePOINTS?tab=readme-ov-file"  
[X Link](https://x.com/Xianbao_QIAN/status/1848662101004587247)  2024-10-22T09:46Z [----] followers, [----] engagements


"Do you know that @huggingface hub now has a feature to easily find all the derived models of SD [---]? There are already [--] models and half of them are from @ShakkerAI_Team @Haofan_Wang Great work! By the time you read it there should be more: https://huggingface.co/models?other=base_model:adapter:stabilityai/stable-diffusion-3.5-large today is joyfully chaotic in open-source ML: transformers.js v3 is out @xenovacom Stable Diffusion [---] is out with 0-day diffusers support, people are already training LoRAs 😱 @multimodalart New LLMs by IBM: Granite [---] released yesterday. What else have I missed?"  
[X Link](https://x.com/Xianbao_QIAN/status/1848907833330565614)  2024-10-23T02:02Z [----] followers, [---] engagements


"@casper_hansen_ We plan to complete the model training and evaluation no later than the end of November and will release all data models and code to the community"  
[X Link](https://x.com/Xianbao_QIAN/status/1852542492580942334)  2024-11-02T02:45Z [----] followers, [----] engagements


"FLUX + qwen = Model: https://huggingface.co/cfahlgren1/flux-qwen-capybara Inference available on the right side bar"  
[X Link](https://x.com/Xianbao_QIAN/status/1852630568238023099)  2024-11-02T08:35Z [----] followers, [----] engagements


"Tencent has just dropped two SoTA models, finally under Tencent's organization - Hunyuan3D: unified model for Text-to-3D and Image-to-3D - Hunyuan-large: A52B with [---] total params trained with synthetic data and up to 256k context in the pretrain model https://huggingface.co/tencent"  
[X Link](https://x.com/Xianbao_QIAN/status/1853718755643461976)  2024-11-05T08:39Z [----] followers, [----] engagements


"They have built a demo if you want to give it a try: https://huggingface.co/spaces/tencent/Hunyuan-Large Impressive new SOTA open-source LLM in the new update of Hunyuan-Large by Tencent Model: https://t.co/oaqjKjbIfz Paper and discussion: https://t.co/K0FwYTLYKB A couple of strong points: - strong performances in math (probably from the very large Chinese pretraining datasets - https://t.co/LfC22XFsnK"  
[X Link](https://x.com/Xianbao_QIAN/status/1854487273342935515)  2024-11-07T11:33Z [----] followers, [---] engagements


"If you haven't seen this, please try the image editing model from ByteDance. I hope this feature gets integrated into TikTok quickly. https://huggingface.co/spaces/ByteDance/SeedEdit-APP"  
[X Link](https://x.com/Xianbao_QIAN/status/1855996493414515113)  2024-11-11T15:30Z [----] followers, [---] engagements


"@deepseek_ai released JanusFlow, a powerful framework that unifies image understanding and generation in a single model: AR + rectified flow - small: 1.3B - supports images of [---] x [---] for understanding and generation - simple: rectified flow trained within LLMs"  
[X Link](https://x.com/Xianbao_QIAN/status/1856688719698276774)  2024-11-13T13:21Z [----] followers, [---] engagements


"If you're interested in checking the number of derived models for each model family on @huggingface, I got a Space for that: https://huggingface.co/spaces/xianbao/hf-public-data-insights @Alibaba_Qwen @GoogleAI @UnslothAI @Meta @StabilityAI @bfl_ml are leading. Alibaba is now a clear leader in open-source AI as they stated in their latest earnings call https://t.co/btEEES6Jjd"  
[X Link](https://x.com/Xianbao_QIAN/status/1857621880955637990)  2024-11-16T03:09Z [----] followers, [---] engagements


"@AlibabaGroup Model: https://huggingface.co/AIDC-AI/Marco-o1"  
[X Link](https://x.com/Xianbao_QIAN/status/1859808355663142921)  2024-11-22T03:57Z [----] followers, [---] engagements


"@reach_vb Short Nvidia and long all other US AI stocks"  
[X Link](https://x.com/Xianbao_QIAN/status/1861371084471443585)  2024-11-26T11:27Z [----] followers, [---] engagements


"Final answer = Problem solving process = Self-talking thinking and reasoning steps I like how the model is thinking out loud and trying different ways to solve the problem. Meet QwQ-32B-preview an experimental model from the @Alibaba_Qwen team with enhanced reasoning steps"  
[X Link](https://x.com/Xianbao_QIAN/status/1861935631418745238)  2024-11-28T00:50Z [----] followers, [----] engagements


"@Alibaba_Qwen Demo: https://huggingface.co/spaces/Qwen/QwQ-32B-preview Model: https://huggingface.co/Qwen/QwQ-32B-Preview It's Apache [--] license."  
[X Link](https://x.com/Xianbao_QIAN/status/1861935795424378963)  2024-11-28T00:51Z [----] followers, [---] engagements


"@thukeg claimed that their edge model achieved over [--] tok/s on the @Qualcomm SoC. Now these models including the VL have been open-sourced on @huggingface Great work Now the question is: Would @Qualcomm consider OS their Gen AI engine to revolutionize our mobile industry"  
[X Link](https://x.com/Xianbao_QIAN/status/1862358410710458765)  2024-11-29T04:50Z [----] followers, [--] engagements


"@thukeg @Qualcomm @huggingface Demo of the LLM: https://huggingface.co/spaces/THUDM-HF-SPACE/GLM-Edge-1.5B-Chat-Space Demo of the VL: https://huggingface.co/spaces/THUDM-HF-SPACE/GLM-Edge-V-5B-Space"  
[X Link](https://x.com/Xianbao_QIAN/status/1862358733235642518)  2024-11-29T04:51Z [----] followers, [--] engagements


"@AlibabaGroup is the GOAT in open source. [--] out of the top ten trending models on @huggingface are from Alibaba. These include two o1 replicas with distinct approaches, Marco-o1 and QwQ, as well as the latest Qwen Coder model. @bfl_ml ranks second with two FLUX models"  
[X Link](https://x.com/Xianbao_QIAN/status/1862437597123162227)  2024-11-29T10:05Z [----] followers, [--] engagements


"@ClementDelangue Great work @Kwai_Kolors"  
[X Link](https://x.com/Xianbao_QIAN/status/1863004895135309909)  2024-11-30T23:39Z [----] followers, [---] engagements


"So true. But who could be the alternative to the ChatGPT labeler? Would humans stand a chance?"  
[X Link](https://x.com/Xianbao_QIAN/status/1863196415713050657)  2024-12-01T12:20Z [----] followers, [---] engagements


"@clefourrier @AdinaYakup Thanks to the great work @AdinaYakup Let's make them shine in the social networks"  
[X Link](https://x.com/Xianbao_QIAN/status/1864271873825210454)  2024-12-04T11:33Z [----] followers, [--] engagements


"I made a new post on the open-source Chinese LLM ecosystem, which has transformed dramatically in just [--] months. Highlights include: - The rise of @Alibaba_Qwen - @deepseek_ai innovation driving affordable AI - The new exploratory journey on inference scaling law - On-device AI models enabling privacy-first experiences - China's proactive AI governance Link below. Let me know your thoughts"  
[X Link](https://x.com/Xianbao_QIAN/status/1864296520448921914)  2024-12-04T13:11Z [----] followers, [----] engagements


"@Alibaba_Qwen @deepseek_ai https://xianbao-qian.medium.com/dec-2024-chinese-os-llms-1d1d56e4506a"  
[X Link](https://x.com/Xianbao_QIAN/status/1864296575192977641)  2024-12-04T13:11Z [----] followers, [---] engagements


"@reach_vb Very close to eleven labs now"  
[X Link](https://x.com/Xianbao_QIAN/status/1864542273700335916)  2024-12-05T05:28Z [----] followers, [---] engagements


"@osanseviero o1-ish integration that makes small Gemma models even smarter"  
[X Link](https://x.com/Xianbao_QIAN/status/1864640801650954623)  2024-12-05T11:59Z [----] followers, [---] engagements


"@denny_zhou Would be nice to also have DeepSeek and ByteDance on the list"  
[X Link](https://x.com/Xianbao_QIAN/status/1865894719391400032)  2024-12-08T23:02Z [----] followers, [---] engagements


"So many incredible new models trending on @HuggingFace this week! Tencent's HunyuanVideo, Meta's Llama-3.3, Qwen's QwQ-32B and more are pushing AI forward. Details below. It's all super exciting - but honestly I'm a bit concerned about the increasing number of OS models not accessible in the EU. tencent/HunyuanVideo the largest and best open source model so far for video generation released by @TencentGlobal @TXhunyuan team. It's raising the bar for video generation models. Unfortunately this model limits its access in the EU, same as Llama. meta-llama/Llama-3.3-70B-Instruct the latest LLM release from @Meta"  
[X Link](https://x.com/Xianbao_QIAN/status/1865911098110804429)  2024-12-09T00:07Z [----] followers, [---] engagements


"TAPTRs: Track Any Point TRansformers A great application for Vision Pro"  
[X Link](https://x.com/Xianbao_QIAN/status/1866788166201717222)  2024-12-11T10:12Z [----] followers, [---] engagements


"This might be the first time I've seen an equal number of models and datasets on the trending page. Usually models outnumber datasets. What could this suggest?"  
[X Link](https://x.com/Xianbao_QIAN/status/1869489586294140999)  2024-12-18T21:07Z [----] followers, [---] engagements


"This video from Genesis a new physics engine is simply INSANE"  
[X Link](https://x.com/Xianbao_QIAN/status/1869672064254456143)  2024-12-19T09:12Z [----] followers, [---] engagements


"InternVL2.5 is delivering an impressive performance boost with enhanced reasoning capabilities. It's amazing how quickly these improvements followed the original release"  
[X Link](https://x.com/Xianbao_QIAN/status/1871301068996685847)  2024-12-23T21:05Z [----] followers, [--] engagements


"InternVL2.5 is delivering an impressive performance boost with enhanced reasoning capabilities. It's amazing how quickly these improvements followed the original release. Both models & datasets released"  
[X Link](https://x.com/Xianbao_QIAN/status/1871302028032655702)  2024-12-23T21:09Z [----] followers, [---] engagements


"What's the Christmas gift from @Alibaba_Qwen team QvQ A Qwen2-VL-72B based multimodal understanding model. It excels at tackling complex tasks and delivers SoTA performance on certain leading benchmarks. Great work @JustinLin610 @huybery and the team and happy new year"  
[X Link](https://x.com/Xianbao_QIAN/status/1871692756210561355)  2024-12-24T23:01Z [----] followers, [----] engagements


"@Alibaba_Qwen @JustinLin610 @huybery Link: https://huggingface.co/Qwen/QVQ-72B-Preview https://huggingface.co/Qwen/QVQ-72B-Preview"  
[X Link](https://x.com/Xianbao_QIAN/status/1871692790381658488)  2024-12-24T23:01Z [----] followers, [---] engagements


"There's been a lot of speculation on X about the preeminence of Chinese companies in OS AI. But is X really the best place to find the answer? What people are saying in Chinese within China might be far more relevant. Come and check out this question from Zhihu"  
[X Link](https://x.com/Xianbao_QIAN/status/1877341710143103077)  2025-01-09T13:08Z [----] followers, [---] engagements


"@rpbmpn I wouldn't be surprised if in the end LLMs develop entirely new languages for their own reasoning, or even multiple languages for different topics"  
[X Link](https://x.com/Xianbao_QIAN/status/1878738326246854911)  2025-01-13T09:38Z [----] followers, [--] engagements


"What could happen if we apply the same level of compute and high-quality data to a NON-Transformer architecture? The new model from @Hailuo_AI has me wondering whether scaling laws might benefit even more from LINEAR ATTENTION. By reducing complexity from O(n^2) down to O(n) we could bring expenses down significantly, supporting more sustained growth. Give it a try on @huggingface Space and let me know what you think: could this be the future direction for large-scale models? Linear attention v.s. Transformer arch We have a new player in the OS LLM world. MiniMax a.k.a the company that released"  
[X Link](https://x.com/Xianbao_QIAN/status/1879406535773917566)  2025-01-15T05:53Z [----] followers, [----] engagements
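
To make the O(n^2) versus O(n) point above concrete, here is a small pure-Python sketch of generic kernelized linear attention (non-causal, single head). It is an illustration of the general idea only and does not reproduce MiniMax's architecture; the feature map `phi` (an elu(x) + 1 style positive map) and all function names are assumptions chosen for this example.

```python
import math

def phi(vec):
    """Positive feature map, elu(x) + 1 (an assumed choice for this sketch)."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in vec]

def quadratic_attention(Q, K, V):
    """Scores every query against every key: cost grows as n * n."""
    d_v = len(V[0])
    out = []
    for q in Q:
        fq = phi(q)
        scores = [sum(a * b for a, b in zip(fq, phi(k))) for k in K]
        total = sum(scores)
        out.append([sum(s * v[j] for s, v in zip(scores, V)) / total for j in range(d_v)])
    return out

def linear_attention(Q, K, V):
    """Summarises keys and values once, then reuses the summary: cost grows as n."""
    d_k, d_v = len(K[0]), len(V[0])
    kv = [[0.0] * d_v for _ in range(d_k)]   # running sum of phi(k) outer v
    k_sum = [0.0] * d_k                      # running sum of phi(k)
    for k, v in zip(K, V):
        fk = phi(k)
        for i in range(d_k):
            k_sum[i] += fk[i]
            for j in range(d_v):
                kv[i][j] += fk[i] * v[j]
    out = []
    for q in Q:
        fq = phi(q)
        norm = sum(a * b for a, b in zip(fq, k_sum))
        out.append([sum(fq[i] * kv[i][j] for i in range(d_k)) / norm for j in range(d_v)])
    return out

Q = [[0.1, -0.4], [1.2, 0.3], [-0.7, 0.9]]
K = [[0.5, 0.2], [-1.0, 0.8], [0.3, -0.6]]
V = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0], [3.0, 1.0, 0.5]]
slow = quadratic_attention(Q, K, V)
fast = linear_attention(Q, K, V)
```

Both functions compute the same result because matrix multiplication is associative; the linear variant simply never materialises the n-by-n score table. This trick relies on replacing softmax with a kernel feature map, so it is not a drop-in equivalent of standard softmax attention.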


"It's amazing that so many people think @TXhunyuan from Tencent is the best open source video generation model https://x.com/ClementDelangue/status/1881018041963802742 Current best open source video generation model"  
[X Link](https://x.com/Xianbao_QIAN/status/1881190900698878347)  2025-01-20T04:03Z [----] followers, [----] engagements


"What a week! The Chinese AI community has just delivered a wave of groundbreaking open-source breakthroughs, all freely available to the public. A massive shoutout to the teams pushing boundaries 🌟 A few highlights: [--] DeepSeek R1 Now claims the crown as the strongest open-source reasoning model. Expect a ripple effect: its distillation potential could elevate all OS models. 🧠💥 @deepseek_ai [--] MiniMax's Hybrid LLM Proves linear attention can effectively scale, first to hit 4M-token context windows. A paradigm shift for long-context modeling. 📈 @Hailuo_AI [--] Qwen's ORM A critical missing piece"  
[X Link](https://x.com/Xianbao_QIAN/status/1881686750722208186)  2025-01-21T12:54Z [----] followers, [----] engagements


"Both of my favorite teams released new models just before Chinese New Year's Eve - AI never sleeps. @Alibaba_Qwen released their versatile VL model that can handle long video understanding and give precise bounding boxes for object detection. It can also generate structured data output. @deepseek_ai released the latest version of their true multimodal model. It can both understand and generate images, beating the performance of @OpenAI's DALL-E 3"  
[X Link](https://x.com/Xianbao_QIAN/status/1883993942494196097)  2025-01-27T21:42Z [----] followers, [----] engagements


"@deepseek_ai Link to the above 0.5B model: https://github.com/dhcode-cpp/X-R1 To be open sourced."  
[X Link](https://x.com/Xianbao_QIAN/status/1889196708657320199)  2025-02-11T06:16Z [----] followers, [---] engagements


"@MichaelXu25 livecodebench :)"  
[X Link](https://x.com/Xianbao_QIAN/status/1895367003248161186)  2025-02-28T06:54Z [----] followers, [---] engagements


"The growth trend for WAN2.1 is amazing @Alibaba_Wan"  
[X Link](https://x.com/Xianbao_QIAN/status/1896466353583955969)  2025-03-03T07:43Z [----] followers, [---] engagements


"Interesting. Without premium GPUs then what kind of non-premium GPU could it be"  
[X Link](https://x.com/Xianbao_QIAN/status/1899374703061639598)  2025-03-11T08:19Z [----] followers, [----] engagements


"Seedream [---] - ByteDance's new image generation foundation model. Papers available"  
[X Link](https://x.com/Xianbao_QIAN/status/1899663244773687784)  2025-03-12T03:26Z [----] followers, [----] engagements


"R1-Omni weights are now available on @huggingface. It can now understand emotions -- link below Alibaba just dropped R1-Omni Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning https://t.co/vO6UArJPqc"  
[X Link](https://x.com/Xianbao_QIAN/status/1899721086826340490)  2025-03-12T07:16Z [----] followers, [---] engagements


"Hunyuan is goat on 3D generation On @huggingface if you query for image-2-3D task all top [--] trending models are contributed by @TXhunyuan"  
[X Link](https://x.com/Xianbao_QIAN/status/1902649542098796558)  2025-03-20T09:12Z [----] followers, 11.4K engagements


"If you were to build a real time chatbot app that can naturally talk to users, you might be interested in TEN @TenFramework. Check out this demo that seamlessly turns qwen 1.5B into a realtime chatbot with the capability to interrupt. How can I upgrade my OK Siri with that?"  
[X Link](https://x.com/Xianbao_QIAN/status/1905597175507329060)  2025-03-28T12:25Z [----] followers, [---] engagements


"AReal-boba (Ant Reasoning RL also known as A-Real-boba) is a fascinating project that deserves a spot on your watchlist - Fully Open-Source: AReal-boba is completely open-source with a very reasonable training budget making it accessible for a lot of people. - Works with Limited Data: It is designed to perform well even with limited data requiring as few as [---] samples to train effectively. - Build in Public: The project follows a "build in public" approach with weekly releases giving you the opportunity to actively participate and contribute. - Inference Throughput Focused: The model is"  
[X Link](https://x.com/Xianbao_QIAN/status/1907325698102202574)  2025-04-02T06:54Z [----] followers, [---] engagements


"@victormustar Have to say that the generation is SOOO fast. Every time I try it I'm very impressed. More seriously, when will there be a DeepSpace that helps me create a Space that I can readily deploy on @huggingface, ideally powered by @Gradio?"  
[X Link](https://x.com/Xianbao_QIAN/status/1907327219162366447)  2025-04-02T07:00Z [----] followers, [---] engagements


"How to fund your personal OS project & trade US stocks like a pro 💰 Introducing the "DeepSeek" strategy: when a new model drops from DeepSeek, wait for NQ to bounce back to a resistance level. Then start shorting. Don't call yourself GPU poor again - The wait won't be long _"  
[X Link](https://x.com/Xianbao_QIAN/status/1908298684250591484)  2025-04-04T23:20Z [----] followers, [---] engagements


"Want your favourite object to appear in your video Check out the skyreal-a2 model on @huggingface"  
[X Link](https://x.com/Xianbao_QIAN/status/1909734999190131154)  2025-04-08T22:27Z [----] followers, [----] engagements


"HiDream-I1-Dev, the best open source image generation model, is available under MIT license on @huggingface"  
[X Link](https://x.com/Xianbao_QIAN/status/1909771866476364031)  2025-04-09T00:54Z [----] followers, [----] engagements


"This looks impressive. The arxiv link seems to be broken at the moment. Looking forward to seeing the pdf. Pangu Ultra is a 135B dense LLM trained on 13.2T tokens using [----] Ascend NPUs. Introduces depth-scaled sandwich normalization to stabilize deep model training. Outperforms dense baselines like Llama 405B and Mistral Large [--] across multiple benchmarks and achieves competitive https://t.co/hIlJUNuRLv"  
[X Link](https://x.com/Xianbao_QIAN/status/1910662568680841301)  2025-04-11T11:53Z [----] followers, [---] engagements


"I just learnt that the closest official English pronunciation of Hunyuan is hwoon you-en. They have a great video generation model - hunyuan video and wan are probably the two best video generation models now. They also have a great series of 3D generation models. Details below"  
[X Link](https://x.com/Xianbao_QIAN/status/1910681226480120064)  2025-04-11T13:07Z [----] followers, [---] engagements


"The open source version can be found at https://huggingface.co/tencent"  
[X Link](https://x.com/Xianbao_QIAN/status/1910681451710079433)  2025-04-11T13:08Z [----] followers, [--] engagements


"Want to try out HiDream -dev/full and compare that with Flux? Here is an amazingly FREE and FAST @huggingface demo that gives you the output for ALL of them at once. Powered by @wavespeed_ai HiDream could be a game changer due to its high generation quality and open source nature with MIT license. Where is it good / bad at See below. https://t.co/wETq6ZAKKC"  
[X Link](https://x.com/Xianbao_QIAN/status/1911203070467604831)  2025-04-12T23:41Z [----] followers, [----] engagements


"@yshan2u That's very impressive Looking forward to next gen Hunyuan model"  
[X Link](https://x.com/Xianbao_QIAN/status/1912298239078068638)  2025-04-16T00:13Z [----] followers, [--] engagements


"@Baidu_Inc is back on @huggingface with their Ernie [---] demo Looking forward to the release in June"  
[X Link](https://x.com/Xianbao_QIAN/status/1914530035353997639)  2025-04-22T04:01Z [----] followers, [--] engagements


"@Baidu_Inc @huggingface Link: https://huggingface.co/spaces/PaddlePaddle/ernie_demo https://huggingface.co/spaces/PaddlePaddle/ernie_demo"  
[X Link](https://x.com/Xianbao_QIAN/status/1914530216590143700)  2025-04-22T04:02Z [----] followers, [--] engagements


"@DRoboticsDev BPU is controlling dual SO-100 arms to fold cloth. Dataset available on @huggingface"  
[X Link](https://x.com/Xianbao_QIAN/status/1914994790984458360)  2025-04-23T10:48Z [----] followers, [----] engagements


"Li Auto opensourced their car operating system - HaloOS. Link below:"  
[X Link](https://x.com/Xianbao_QIAN/status/1915001556401102869)  2025-04-23T11:15Z [----] followers, [----] engagements


"#Qwen3 @Alibaba_Qwen is now available and one of its most exciting features is the thinking budget feature accessible on However the model card didn't disclose any related information. So what exactly is it Lets explore this further. http://chat.qwen.ai http://chat.qwen.ai"  
[X Link](https://x.com/Xianbao_QIAN/status/1917114247991201953)  2025-04-29T07:10Z [----] followers, [---] engagements


"@Alibaba_Qwen Apparently this is semantic based. As long as the content contains instructions related to "don't think" the thinking process is skipped. This works beyond English - if you say in Chinese "Don't think" " it works perfectly fine as well"  
[X Link](https://x.com/Xianbao_QIAN/status/1917119273715351585)  2025-04-29T07:30Z [----] followers, [---] engagements


"@Alibaba_Qwen But slightly change the wording will make it fail to work. e.g. Don't do deep research still triggers thinking. Very interesting"  
[X Link](https://x.com/Xianbao_QIAN/status/1917119576367894616)  2025-04-29T07:31Z [----] followers, [---] engagements


"Both @Alibaba_Qwen and @deepseek_ai dropped new models this week. What can we learn from that Labour Day holiday is the hard deadline to drive productivity πŸ€–πŸ› πŸš€ #AI #LabourDay Happy labour day"  
[X Link](https://x.com/Xianbao_QIAN/status/1917514141864190366)  2025-04-30T09:39Z [----] followers, [---] engagements


"Anice collection of Unified Multimodal Models"  
[X Link](https://x.com/Xianbao_QIAN/status/1920445525742268632)  2025-05-08T11:47Z [----] followers, [----] engagements


"@Alibaba_Qwen webdev Very cool"  
[X Link](https://x.com/Xianbao_QIAN/status/1920806335563628983)  2025-05-09T11:41Z [----] followers, [---] engagements


"I started to follow Baidu on @huggingface because they have subscribed to enterprise account. What's going to happen πŸ‘€"  
[X Link](https://x.com/Xianbao_QIAN/status/1930635059662668046)  2025-06-05T14:37Z [----] followers, [----] engagements


"Xi Jinping Holds Telephone Conversation with U.S. President Trump Big news. What's going to happen next ----- Below is the ChatGPT translation of Chinese statement. Date: June [--] [----] 10:49 PM Source: Xinhua News Agency On the evening of June [--] President Xi Jinping held a telephone conversation with U.S. President Donald Trump at the latter's request. Key Points from President Xis Remarks: President Xi emphasized that correcting the course of China-U.S. relations is like steering a large ship requiring careful guidance and a clear direction. It is especially important to eliminate various"  
[X Link](https://x.com/Xianbao_QIAN/status/1930648205018239430)  2025-06-05T15:29Z [----] followers, [---] engagements


"My two coins from this Aider LLM Leaderboards result @GeminiApp is now leading. A month ago I consider it ranking the second but now it has surpassed @OpenAI 's GPT o3 in both performance and latency. Great work @GoogleDeepMind @deepseek_ai continues to be highly competitive especially if cost-effectiveness is critical or if you want full control over your data. Open source to the win"  
[X Link](https://x.com/Xianbao_QIAN/status/1932214365048230216)  2025-06-09T23:12Z [----] followers, [----] engagements


"Bytedance is goat"  
[X Link](https://x.com/Xianbao_QIAN/status/1932972909209022795)  2025-06-12T01:27Z [----] followers, [---] engagements


"ByteDance is now added to @calebfahlgren 's @huggingface heatmap and it shows a very clear upward trend of open source AI contributions. Keep up the excellent work @ByteDance_Seed"  
[X Link](https://x.com/Xianbao_QIAN/status/1933565639425470505)  2025-06-13T16:42Z [----] followers, [---] engagements


"Tencent is goat Apart from the 3D generation model the new music generation also worth trying. Check out the demo on their homepage and model weights from @huggingface"  
[X Link](https://x.com/Xianbao_QIAN/status/1934592172428505504)  2025-06-16T12:41Z [----] followers, [---] engagements


"New LLM from @MiniMax__AI is now available on @huggingface - Hybrid linear attention friendly to inference. - Reasoning model with two variants with 40k/80k thinking budget - Apache [---] license - Context length of 1M - Great support of function calling"  
[X Link](https://x.com/Xianbao_QIAN/status/1934784688121631089)  2025-06-17T01:26Z [----] followers, [---] engagements


"A new virtual try-on model by @ZJU_China and @vivo_europe opensourced on @huggingface - generating video while keeping - fully open sourced: inference + weight; training code coming - wan [---] backbone with full attention for spatiotemporal consistency - CC By NC license vivoMagicTryOn  https://t.co/1yd29hx85o vivoMagicTryOn  https://t.co/1yd29hx85o"  
[X Link](https://x.com/Xianbao_QIAN/status/1934794822600282148)  2025-06-17T02:06Z [----] followers, [---] engagements


"@ZJU_China @vivo_europe @huggingface https://huggingface.co/LuckyLiGY/MagicTryOn https://huggingface.co/LuckyLiGY/MagicTryOn"  
[X Link](https://x.com/Xianbao_QIAN/status/1934794852044288276)  2025-06-17T02:06Z [----] followers, [---] engagements


"o3-pro is now able to fetch and display images. But there are two few images. There should be more :)"  
[X Link](https://x.com/Xianbao_QIAN/status/1935122558434808213)  2025-06-17T23:49Z [----] followers, [---] engagements


"HF has 100k spaces with this feature turning popular spaces into MCP compatible HF hub will become the MCP hub. Hugging Face Spaces the world's largest AI app directory is now MCP-compatible 🀯 Here turning an entire website into Ghibli in one shot with Claude Code to demonstrate what's now possible: instantly access any A I tool right in your LLM client https://t.co/isJxweJMyV Hugging Face Spaces the world's largest AI app directory is now MCP-compatible 🀯 Here turning an entire website into Ghibli in one shot with Claude Code to demonstrate what's now possible: instantly access any A I"  
[X Link](https://x.com/Xianbao_QIAN/status/1935153064769696128)  2025-06-18T01:50Z [----] followers, [---] engagements


"If you want a free version of ChatGPT 4o to edit image with prompts try OmniGen2 on @huggingface - Model & code open sourced technical report available - Apache [--] license - up to [----] x [----] Coolest part It's fully open sourced so you can call this model with MCP. All you need to do it to launch the app with .launch(mcp_server=True) https://huggingface.co/spaces/OmniGen2/OmniGen2 https://huggingface.co/spaces/OmniGen2/OmniGen2"  
[X Link](https://x.com/Xianbao_QIAN/status/1937471418700288153)  2025-06-24T11:22Z [----] followers, 10.8K engagements
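
The `.launch(mcp_server=True)` call mentioned in the post is a Gradio launch option. A minimal sketch of how it might be used, assuming a recent Gradio release installed with its MCP extra (`pip install "gradio[mcp]"`). The `describe_edit` function and its return text are hypothetical stand-ins, not OmniGen2's actual app code:

```python
def describe_edit(prompt: str) -> str:
    """Summarize a requested image edit.

    Hypothetical placeholder: a real app would run the image model here.
    The docstring and type hints are kept because Gradio is documented as
    using them to describe the tool to MCP clients.
    """
    return f"Edit request: {prompt.strip()}"


def build_and_launch() -> None:
    """Start the web UI and, alongside it, an MCP server for the same function."""
    # Imported lazily so describe_edit can be used without Gradio installed.
    import gradio as gr

    demo = gr.Interface(fn=describe_edit, inputs="text", outputs="text")
    demo.launch(mcp_server=True)
```

Calling `build_and_launch()` starts the app; an MCP-capable client can then be pointed at the server URL that Gradio prints on startup.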


"@soul_surfer78 @_akhaliq @huggingface OmniGen2 natively requires an NVIDIA RTX [----] or an equivalent GPU with approximately 17GB of VRAM. For devices with less VRAM you can enable CPU Offload to run the model"  
[X Link](https://x.com/Xianbao_QIAN/status/1937652324274700805)  2025-06-24T23:21Z [----] followers, [---] engagements


"Only need RTX3090 or 17GB of vram to run OmniGen2 natively requires an NVIDIA RTX [----] or an equivalent GPU with approximately 17GB of VRAM. For devices with less VRAM you can enable CPU Offload to run the model. If you want a free version of ChatGPT 4o to edit image with prompts try OmniGen2 on @huggingface - Model & code open sourced technical report available - Apache [--] license - up to [----] x [----] Coolest part It's fully open sourced so you can call this model with MCP. All you https://t.co/jMT4HX6AyP If you want a free version of ChatGPT 4o to edit image with prompts try OmniGen2 on"  
[X Link](https://x.com/Xianbao_QIAN/status/1937652467468238951)  2025-06-24T23:22Z [----] followers, [---] engagements


"@TencentHunyuan Tencent rocks Both their new LLM and 3D generation on are on @huggingface trending list"  
[X Link](https://x.com/Xianbao_QIAN/status/1938956380897194219)  2025-06-28T13:43Z [----] followers, [---] engagements


"New VL model from Alibaba: Ovis U1 https://huggingface.co/AIDC-AI/Ovis-U1-3B https://huggingface.co/AIDC-AI/Ovis-U1-3B"  
[X Link](https://x.com/Xianbao_QIAN/status/1939324782522335688)  2025-06-29T14:07Z [----] followers, [----] engagements


"@zephyr_z9 Ant group is completely detached from Alibaba due to regulations"  
[X Link](https://x.com/Xianbao_QIAN/status/1939330132625539094)  2025-06-29T14:28Z [----] followers, [---] engagements


"Pretrain performance: They claimed that the model is more performance on math and reasoning than DeepSeek V3. Is that a hint that one can train a better R1 from Ernie"  
[X Link](https://x.com/Xianbao_QIAN/status/1939507249665475019)  2025-06-30T02:12Z [----] followers, [---] engagements


"Paddle has a long history of being one of the best OCR tool. And Ernie is not letting me down that it achieved better Doc & Chart understanding than OpenAI-o1. And presumably the API would be way cheaper than o1 as well"  
[X Link](https://x.com/Xianbao_QIAN/status/1939507655099457902)  2025-06-30T02:13Z [----] followers, [---] engagements


"Also worth noting that the model is trained using Paddle :) And then converted to PyTorch weights"  
[X Link](https://x.com/Xianbao_QIAN/status/1939513055207174300)  2025-06-30T02:35Z [----] followers, [---] engagements


"The technical report is very detailed and well written. It also contains a lot of engineering details on how Paddle solved the challenges of training LLMs on massive GPU cluster (10k). For example this framework native solution for zero cost checkpoint is very interesting"  
[X Link](https://x.com/Xianbao_QIAN/status/1939514252261544443)  2025-06-30T02:40Z [----] followers, [---] engagements


"If you're interested in learning more about the model. Here are some useful links: - Technical report: - Blog: - Model: https://huggingface.co/baidu https://ernie.baidu.com/blog/posts/ernie4.5/ https://yiyan.baidu.com/blog/publication/ERNIE_Technical_Report.pdf https://huggingface.co/baidu https://ernie.baidu.com/blog/posts/ernie4.5/ https://yiyan.baidu.com/blog/publication/ERNIE_Technical_Report.pdf"  
[X Link](https://x.com/Xianbao_QIAN/status/1939517883866087698)  2025-06-30T02:54Z [----] followers, [---] engagements


"Half of my friend circle attended the ModelScope event. Mind blowing Looking forward to see more developer focused events in China. Great work @MaaSAI42 team"  
[X Link](https://x.com/Xianbao_QIAN/status/1939665935327224153)  2025-06-30T12:42Z [----] followers, [---] engagements


"You won't believed that this audio track is added by a model. It aligns perfectly with the video. The ThinkSound model might have just unveiled veo3's hidden magic. Check it out on @huggingface Space: Github: https://github.com/liuhuadai/ThinkSound https://huggingface.co/spaces/FunAudioLLM/ThinkSound https://github.com/liuhuadai/ThinkSound https://huggingface.co/spaces/FunAudioLLM/ThinkSound"  
[X Link](https://x.com/Xianbao_QIAN/status/1940048043799441421)  2025-07-01T14:01Z [----] followers, 13.3K engagements


"Amazing Is this the first open source video + audio generation model with this level of lip sync capability Here is the prompt: A muscular man with a beard and tattoos clenching his fists and glaring angrily at the camera speaking: "I am more than your prompt I am strong" veo3-ish video + audio generation using open source model Great work MTVCraft Detailed below: https://t.co/lhBXyOZanV veo3-ish video + audio generation using open source model Great work MTVCraft Detailed below: https://t.co/lhBXyOZanV"  
[X Link](https://x.com/Xianbao_QIAN/status/1940401159300386921)  2025-07-02T13:24Z [----] followers, [---] engagements


"OMG this is mind blowing Driving in a truly infinite open-world sandbox game feels magical Youll never know where youll end up next. Mirage can generate games across a wide range of genresfrom racing🏎 to RPGsπŸ•΄ to platformers🎴 4/ https://t.co/Arud47x8Fv Mirage can generate games across a wide range of genresfrom racing🏎 to RPGsπŸ•΄ to platformers🎴 4/ https://t.co/Arud47x8Fv"  
[X Link](https://x.com/Xianbao_QIAN/status/1940562578968866835)  2025-07-03T00:05Z [----] followers, [---] engagements


"@rryssf_ @thukeg +1 Looking forward to more show cases from real world scenarios"  
[X Link](https://x.com/Xianbao_QIAN/status/1940566471815708921)  2025-07-03T00:21Z [----] followers, [--] engagements


"RL generalizes SFT doesn't. Self-exploration and feedback leads to AGI not SFT. Does that mean once we reach certain intelligence level (able to generate valid candidate) the value of human generated data is diminishing. And feedback especially rule generated massive scale feedback is becoming critical. Which means that the proportion of human contribution on the way of pursuing AGI is diminishing yet compute becomes everything (if it hasn't been). Long NVAMDGOOGTSMSMIC People are racing to push math reasoning performance in #LLMsbut have we really asked why The common assumption is that"  
[X Link](https://x.com/Xianbao_QIAN/status/1940597531756683603)  2025-07-03T02:24Z [----] followers, [----] engagements


"@xeophon_ There are so many orgs under Alibaba :sigh: I wish they could create a "alibaba" org like Tencent or baidu :)"  
[X Link](https://x.com/Xianbao_QIAN/status/1940751046995071385)  2025-07-03T12:34Z [----] followers, [--] engagements


"LLM built by geologist for the geologist #AI2S https://huggingface.co/GeoGPT-Research-Project/Qwen2.5-72B-GeoGPT https://huggingface.co/GeoGPT-Research-Project/Qwen2.5-72B-GeoGPT"  
[X Link](https://x.com/Xianbao_QIAN/status/1940780737843876026)  2025-07-03T14:32Z [----] followers, 13.2K engagements


"@Zai_org has just launched a ppt generation feature. Really impressed Instructions below:"  
[X Link](https://x.com/Xianbao_QIAN/status/1941382013384261701)  2025-07-05T06:21Z [----] followers, [---] engagements


"Am I late realizing that LLM now evolved their own languages of understanding & generating images Future of languages of AGI era might start from there - a multi-dimensional language seamlessly integrated into the reasoning pipeline. so will LLM dream & think in images"  
[X Link](https://x.com/Xianbao_QIAN/status/1942562918098428214)  2025-07-08T12:34Z [----] followers, [---] engagements


"Whaaa GLM-4.1V-9B has became the top [--] trending model. Have you tried it"  
[X Link](https://x.com/Xianbao_QIAN/status/1942778687189049444)  2025-07-09T02:51Z [----] followers, [----] engagements


"@Baidu_Inc keeps surprising me They have not only released the detailed paper they've also open source the industrial grade Entire training stack on Nvidia GPUs on top of @PaddlePaddle This is incredibly rare and deserves a huge shoutout https://github.com/PaddlePaddle/ERNIE/tree/develop/examples/pre-training I'll never forget this model as well as the relationship between pretrain SFT and RLHF. https://t.co/f9lmxL3Mw0 https://github.com/PaddlePaddle/ERNIE/tree/develop/examples/pre-training I'll never forget this model as well as the relationship between pretrain SFT and RLHF."  
[X Link](https://x.com/Xianbao_QIAN/status/1943311360961777749)  2025-07-10T14:08Z [----] followers, [---] engagements


"More details from their blog scalable pretrain with MuonClip no loss spike for 15.5T token strong agentic tool use powered by large scale multi-turn synthetic data seamless integrated with agent / coding framework such as owl Cline RooCode native integration with vLLM SGLang ktransformers trained with RL for rule verifiable tasks (code math) also solved sparse award issue on non-verifiable task via self-judging Looking forward to the technical report Kimi K2 is open sourced on @huggingface - 1T MoE 32B active params - Excellent coding & Tool use & Math - Not a thinking model - Both BASE and"  
[X Link](https://x.com/Xianbao_QIAN/status/1943688679375183894)  2025-07-11T15:07Z [----] followers, [----] engagements


"20 mins after the kimi K2 release"  
[X Link](https://x.com/Xianbao_QIAN/status/1943690470397554697)  2025-07-11T15:14Z [----] followers, [----] engagements


"I was using Claude Code with K2 and I got rate limited. I didn't notice anything wrong - just post-launch congestion until I checked at the numbers WHAT 468_168 TPM which is 7k+ tokens per second and you're tell me it's a model with 1T total params what kind of infra it is Did people say that they don't have access to GB200 and only have Ascend or H20"  
[X Link](https://x.com/Xianbao_QIAN/status/1943702468006998465)  2025-07-11T16:02Z [----] followers, 20.5K engagements
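
The "7k+ tokens per second" figure follows from dividing the quoted tokens-per-minute (TPM) number by 60. A small sketch of that arithmetic; the function name is only illustrative:

```python
def tokens_per_second(tokens_per_minute: int) -> float:
    """Convert a tokens-per-minute (TPM) figure to tokens per second."""
    return tokens_per_minute / 60


tpm = 468_168  # figure quoted in the post
print(f"{tokens_per_second(tpm):,.1f} tokens/s")  # prints "7,802.8 tokens/s"
```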


"How did they managed to serve a single user 7k+ tokens / seconds What kind of infra are they using Their open source project Mooncake can shed some light: private memory per request optimization is the key to improve the performance of LLM serving infra. This is done by KV Cache centric PD disaggregation further compression by MLA from @deepseek_ai More details can be found from https://github.com/kvcache-ai/Mooncake I was using Claude Code with K2 and I got rate limited. I didn't notice anything wrong - just post-launch congestion until I checked at the numbers WHAT 468_168 TPM which is 7k+"  
[X Link](https://x.com/Xianbao_QIAN/status/1943705699059728880)  2025-07-11T16:15Z [----] followers, [----] engagements


"After @Kimi_Moonshot 's K2 release I think that we have a new set of [--] tigers in LLM industry now (i.e. the so called AI ). And they ALL invest heavily in open source ecosystem. (note that this list excludes ByteDance and Alibaba) They're: @Zai_org @deepseek_ai @MiniMax__AI @Kimi_Moonshot @OpenBMB and @StepFun_ai Keep up your great work"  
[X Link](https://x.com/Xianbao_QIAN/status/1943727253143400556)  2025-07-11T17:41Z [----] followers, [----] engagements


"@casper_hansen_ K2 is DS arch so it should be fairly easy for providers to serve them"  
[X Link](https://x.com/Xianbao_QIAN/status/1943730689431703593)  2025-07-11T17:54Z [----] followers, [---] engagements


"@alalamin19 If you hover over the bar you'll see the model name. They're mostly comparing with GPT [---] and Claude Optus except SWE bench-multilingual where they used Claude Sonnet because the cost of Claude [--] Opus was prohibitive. https://moonshotai.github.io/Kimi-K2/ https://moonshotai.github.io/Kimi-K2/"  
[X Link](https://x.com/Xianbao_QIAN/status/1943874595942969800)  2025-07-12T03:26Z [----] followers, [---] engagements


"@Despierta_1 @zephyr_z9 @Kimi_Moonshot @Zai_org @deepseek_ai They stopped foundational model pretrain"  
[X Link](https://x.com/Xianbao_QIAN/status/1943876832547205166)  2025-07-12T03:35Z [----] followers, [--] engagements

Top accounts mentioned or mentioned by @huggingface @alibabaqwen @deepseekai @zaiorg @01aiyi @justinlin610 @alibabagroup @tencenthunyuan @osanseviero @huybery @thukeg @txhunyuan @kaifulee @parrynee @richardllin @senseyewinning @chatglm @kimimoonshot @meituanlongcat @internlm

Top assets mentioned Microsoft Corp. (MSFT) Flux (FLUX)

Top Social Posts

Top posts by engagements in the last [--] hours

"History in the makingπŸ“– Join the first ever Chinese LLM novel writing competition. Let's explore the boundaries of large language models and see what stories they can spin. Bonus gift for using open source models. πŸ“· Details below. #AIWriting #LLM https://mp.weixin.qq.com/s/9sNOrolEC34OxAZXKFBP_Q https://mp.weixin.qq.com/s/9sNOrolEC34OxAZXKFBP_Q"
X Link 2023-08-12T13:39Z [----] followers, [---] engagements

"Want to know what's involved in training the code model Check this presentation from Loubna :-) Hint: it's just about burning GPUs"
X Link 2023-10-18T13:27Z [---] followers, [--] engagements

"Breaking: Beijing Court makes a groundbreaking ruling on AI-generated art 🎨 - dispute over an image created using Stable Diffusion - creation process: prompts parameters are considered as IP. - The human user who creatively engages with AI holds the copyright"
X Link 2023-12-27T13:15Z [---] followers, [---] engagements

"πŸ“£πŸŽ¨ Breaking: Beijing court declares AI-generated content copyrightable Landmark ruling sets a new precedent. #AI #CopyrightLaw #TechNews πŸš€πŸ” Breaking: Beijing Court makes a groundbreaking ruling on AI-generated art 🎨 - dispute over an image created using Stable Diffusion - creation process: prompts parameters are considered as IP. - The human user who creatively engages with AI holds the copyright. Breaking: Beijing Court makes a groundbreaking ruling on AI-generated art 🎨 - dispute over an image created using Stable Diffusion - creation process: prompts parameters are considered as IP."
X Link 2023-12-27T13:22Z [---] followers, [--] engagements

"MooreThreads' OS re-production of Alibaba's amazing AnimateAnyone project is now live on HF: Demo: The queue could be long so you might want to duplicate this space to skip the queue"
X Link 2024-01-15T08:54Z [---] followers, [---] engagements

"The same prompt doesn't work for me. 🀣 This is me "after HF sell or IPO". I don't like oil head / suits though. me when we sell or IPO @huggingface https://t.co/PizLrQcW61 me when we sell or IPO @huggingface https://t.co/PizLrQcW61"
X Link 2024-01-26T14:12Z [---] followers, [---] engagements

"Would the next generation (the really affordable one) Sora model be built on top of SSM Vision-RWKV from the @RWKV_AI family is now on @huggingface With reduced spatial aggregation complexity it can easily process high-resolution images"
X Link 2024-03-04T09:03Z [---] followers, [----] engagements

"Feel frustrated at slow uploading speed to @huggingface with VPN Some potential solutions - Use Colab as relay - ModelScope to HF - WiseModel to HF Comment below for feedbacks or other recommendations :-)"
X Link 2024-03-22T03:13Z [---] followers, [---] engagements

"Mini-Gemini Multimodal LLM from CUHK arrived on @huggingface - Available in different sizes - Fine-tuned on LLMs - Support high-resolution images Model: Dataset: Demo: https://huggingface.co/collections/YanweiLi/mini-gemini-6603c50b9b43d044171d0854 https://huggingface.co/collections/YanweiLi/mini-gemini-data-660463ea895a01d8f367624e https://huggingface.co/spaces/wcy1122/Mini-Gemini https://huggingface.co/collections/YanweiLi/mini-gemini-6603c50b9b43d044171d0854 https://huggingface.co/collections/YanweiLi/mini-gemini-data-660463ea895a01d8f367624e"
X Link 2024-04-15T07:09Z [---] followers, [----] engagements

"Very interesting article. It seems that one can 4x the context length of a LLM nearly lossless without any training. Would be nice if it can be applied to all RoPE based LLMs in Transformers. Training-Free Long-Context Scaling of Large Language Models https://arxiv.org/pdf/2402.17463.pdf https://arxiv.org/pdf/2402.17463.pdf"
X Link 2024-04-23T13:36Z [----] followers, [----] engagements

"@nhciao OK. I hide the private key which accesses Satoshi's BTC in One Piece of QR code hidden in a far away land on the east now go hunt it with your latest VisionPro"
X Link 2024-04-24T14:22Z [---] followers, [--] engagements

"@art_zucker I want a video version of it and how about naming it as Tik(Tok)Sys lol"
X Link 2024-04-24T14:40Z [---] followers, [--] engagements

"Piecewise Rectified Flow (PeRFlow) model/demo is now available on @huggingface - Fast high-quality image generation in just [--] steps - Work with other SD pipeline including ControlNet IP-Adapter etc. - Better consistency & diversity compared to LCM Links:"
X Link 2024-04-26T16:34Z [---] followers, [---] engagements

"New model from @nvidia Introducing ChatQA-1.5 a family of models that surpasses GPT-4-0613 and Command-R-Plus on RAG and conversational QA. ChatQA-1.5 has two variants: Llama3-ChatQA-1.5-8B https://t.co/H7JvIFCD48 Llama3-ChatQA-1.5-70B https://t.co/Ao3Yw8ECxA We also open source our instruction Introducing ChatQA-1.5 a family of models that surpasses GPT-4-0613 and Command-R-Plus on RAG and conversational QA. ChatQA-1.5 has two variants: Llama3-ChatQA-1.5-8B https://t.co/H7JvIFCD48 Llama3-ChatQA-1.5-70B https://t.co/Ao3Yw8ECxA We also open source our instruction"
X Link 2024-05-04T07:32Z [----] followers, [---] engagements

"China and France signed a Joint Statement on AI and Global Governance Takeaways - prompt the safe development of AI - deepen discussions on international AI governance - provide inclusive access for all respecting multilingualism and cultural diversity http://us.china-embassy.gov.cn/chn/zgyw/202405/t20240507_11293821.htm http://us.china-embassy.gov.cn/chn/zgyw/202405/t20240507_11293821.htm"
X Link 2024-05-07T12:19Z [---] followers, [---] engagements

"@katieelink @SAILhealth @nvidia Congratulations Katie"
X Link 2024-05-08T00:55Z [---] followers, [--] engagements

"@reach_vb Very nice but like many other nice Spaces: 😭"
X Link 2024-05-10T14:38Z [----] followers, [---] engagements

"@LucasAtkins7 @JustinLin610 @Alibaba_Qwen Great work Curious how long it took to fine-tune the model with just 8xH100 thx"
X Link 2024-05-13T03:50Z [----] followers, [---] engagements

"@LucasAtkins7 @JustinLin610 @Alibaba_Qwen Whah that's much shorter than I initially thought Taking Lambda lab pricing for a calculation: $28 * [--] * [--] = $4704 less than $5000 Not bad https://lambdalabs.com/service/gpu-cloud#pricing https://lambdalabs.com/service/gpu-cloud#pricing"
X Link 2024-05-13T04:44Z [----] followers, [--] engagements

"This T2I model release from Tencent is BIG Key features - DiT (The Diffusion Transformer) architecture - Multi-turn dialog - Native English / Chinese understanding Model: Project page: Technical report: https://tencent.github.io/HunyuanDiT/asset/Hunyuan_DiT_Tech_Report_05140553.pdf https://dit.hunyuan.tencent.com/ https://huggingface.co/Tencent-Hunyuan/HunyuanDiT https://tencent.github.io/HunyuanDiT/asset/Hunyuan_DiT_Tech_Report_05140553.pdf https://dit.hunyuan.tencent.com/ https://huggingface.co/Tencent-Hunyuan/HunyuanDiT"
X Link 2024-05-14T07:47Z [----] followers, [----] engagements

"This is now the official demo for Hunyuan-DiT the first OS DiT model that understands both Chinese and English: Spoiler alert: Watch the org and stay tuned for an update on the model https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT This T2I model release from Tencent is BIG Key features - DiT (The Diffusion Transformer) architecture - Multi-turn dialog - Native English / Chinese understanding Model: https://t.co/YFpJJubGGR Project page: https://t.co/PBWQNJMtXC Technical report: https://t.co/5J610iFZKE https://t.co/brk3M1iciJ https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT"
X Link 2024-05-16T14:39Z [----] followers, [----] engagements

"Want to try @deepseek_ai 's DeepSeek V2 and play with the shinny MLA but do not have enough GPUs to run 236B Here comes the the lite version: - Runs on 40G GPU - 16B total 2.4B active params - 5.7T training tokens Base: Chat: https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite-Chat https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite-Chat https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite"
X Link 2024-05-17T04:23Z [----] followers, [----] engagements

"Why the Apache [---] Matters for LLMs πŸ€” @01AI_Yi recently switched from a permissive & commercially friendly license to Apache [---]. And the community loved it πŸš€ @Alibaba_Qwen also had a poll on model license and the majority votes for Apache [---]. Why it is a Big Deal"
X Link 2024-05-24T14:48Z [----] followers, 10.5K engagements

"@01AI_Yi @Alibaba_Qwen πŸ“š Legal Simplicity: Custom licenses need costly & time-consuming legal review. Apache [---] is well-known & easier for legal teams to handle"
X Link 2024-05-24T14:49Z [----] followers, [---] engagements

"@01AI_Yi @Alibaba_Qwen πŸ‘©πŸ’» Developer-Friendly: Legal docs are a pain for devs Apache [---] is well-known and tech-friendly making it easier for non-native developers to understand the implications too"
X Link 2024-05-24T14:50Z [----] followers, [---] engagements

"@01AI_Yi @Alibaba_Qwen πŸ”— Easier Integration: Apache [---] is compatible with many other licenses simplifying tasks like model merging with models of different licensing requirements"
X Link 2024-05-24T14:51Z [----] followers, [---] engagements

"@01AI_Yi @Alibaba_Qwen 🚫 No Permission Needed: Custom licenses often require explicit permission and additional documentation work of filling forms creating barriers. Apache [---] removes this hurdle letting devs focus on innovation"
X Link 2024-05-24T14:51Z [----] followers, [---] engagements

"@01AI_Yi @Alibaba_Qwen There are a lot interesting discussions from @JustinLin610 's poll: which inspired this thread. Any other thoughts Let me know. https://x.com/JustinLin610/status/1793559737482764375 What kind of license do you prefer for our models Why is our license problematic for you Actually it is quite permissive https://x.com/JustinLin610/status/1793559737482764375 What kind of license do you prefer for our models Why is our license problematic for you Actually it is quite permissive"
X Link 2024-05-24T14:53Z [----] followers, [---] engagements

"There are two key components in Transformer architecture: the self-attention layer which captures relationships between tokens in context and the Feed-Forward Network (FFN) layer which stores knowledge. DeepSeek V2 introduces optimizations to both:"
X Link 2024-05-24T15:59Z [----] followers, [----] engagements

"Attention layer normally uses KV Cache to reduce repetitive compute but it consumes significant GPU RAM limiting concurrent requests. DeepSeek V2 introduces Multi-head Latent Attention (MLA) which stores only a small latent representation resulting in substantial RAM savings"
X Link 2024-05-24T16:03Z [----] followers, [----] engagements
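
To make the memory argument above concrete, here is a rough back-of-the-envelope sketch comparing a conventional per-head KV cache with a cache that stores one compressed latent vector per layer. The layer count, head count, head size and latent size below are illustrative assumptions, not figures from the post, and the sketch ignores how keys and values are reconstructed from the latent at inference time:

```python
def kv_cache_bytes_per_token(n_layers: int, n_heads: int, head_dim: int,
                             bytes_per_value: int = 2) -> int:
    """Standard attention: one key and one value vector per head, per layer."""
    return 2 * n_layers * n_heads * head_dim * bytes_per_value


def latent_cache_bytes_per_token(n_layers: int, latent_dim: int,
                                 bytes_per_value: int = 2) -> int:
    """Latent-style cache: a single compressed vector per layer."""
    return n_layers * latent_dim * bytes_per_value


# Assumed, illustrative configuration (16-bit values).
N_LAYERS, N_HEADS, HEAD_DIM, LATENT_DIM = 60, 128, 128, 576

full = kv_cache_bytes_per_token(N_LAYERS, N_HEADS, HEAD_DIM)
latent = latent_cache_bytes_per_token(N_LAYERS, LATENT_DIM)
print(f"full KV cache: {full / 1024:.0f} KiB per token")
print(f"latent cache:  {latent / 1024:.1f} KiB per token")
print(f"reduction:     {full / latent:.1f}x")
```

Under these assumptions the per-token cache shrinks by well over an order of magnitude, which is what allows more concurrent requests to fit in the same GPU memory.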

"DeepSeek V2 utilizes [---] experts instead of the usual [--] as in Mixtral. This approach segments experts into finer granularity for higher specialization and more accurate knowledge acquisition. Activating only a small subset of experts for each token leads to efficient processing"
X Link 2024-05-24T16:05Z [----] followers, [----] engagements
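
The idea of activating only a small subset of experts per token can be sketched as top-k routing over router scores. This is a generic, simplified illustration in plain Python: the expert count, the `top_k` value and the renormalization of gate weights are assumptions made for the example, not details confirmed by the post:

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax over router scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits: list[float], top_k: int) -> list[tuple[int, float]]:
    """Pick the top_k experts for one token and renormalize their gate weights."""
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


# Hypothetical router scores for 8 experts; only 2 are activated for this token.
logits = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.7]
for expert_id, weight in route_token(logits, top_k=2):
    print(f"expert {expert_id}: gate weight {weight:.3f}")
```

Only the selected experts' feed-forward networks would run for this token, so compute scales with `top_k` rather than with the total number of experts.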

"@huggingface TL;DR Pro / Enterprise users can simply enable Dev mode on their Space and then SSH into the Space with Dev Mode. Note that you need to commit and push files before restarting the Space otherwise content will be lost. Also changes need to be manually reloaded to take effects"
X Link 2024-05-26T13:35Z [----] followers, [---] engagements

"@huggingface Pros compared to Colab: - Persistent disk up to 1T with great HF hub integration for model checkpoints and dataset preparation. - Wide GPU selection including L4 A10G A100 H100 and the multi-GPU versions. - Pay as you go no monthly fee for the compute"
X Link 2024-05-26T13:39Z [----] followers, [---] engagements

"Actually I was wrong. Jupyter extension works great - just wait some time for the installation. (would be nice to pre-install it) This new workflow is super productive - try an idea in notebook - commit the code - share it on Space with the free Zero GPU ALL in the same window 🚀 I love Colab but I'm moving away to @huggingface Space for the Dev Mode experience Colab for LLM can be frustrating: disk mounts, backend executions and version control can be tricky. Plus no 80G H100 or multi-GPU options limit its capability. Check out this alternative https://t.co/e9N3X93zD6"
X Link 2024-05-26T14:24Z [----] followers, [---] engagements

"@stablequan @huggingface Good question. I don't think ZeroGPU works in VSCode but it should work on the web portal. Either way, you only have a very limited quota for ZeroGPU, so it's not well suited for development"
X Link 2024-05-26T14:54Z [----] followers, [---] engagements

"Very cool video generated by the new version of Open-Sora plan: v1.1.0 So glad to follow along with the progress of the open video generation model Check out the demo on @huggingface running on the free ZeroGPU: https://huggingface.co/spaces/LanguageBind/Open-Sora-Plan-v1.1.0 📣📣📣We are excited to announce the release of Open-Sora Plan v1.1.0. 🙌Thanks to ShareGPT4Video's capability to annotate long videos we can generate higher quality and longer videos. 🔥🔥🔥We continue to open-source all data code and models https://t.co/C28gHbiPrU https://t.co/qzMvSiU9At"
X Link 2024-05-27T12:39Z [----] followers, [---] engagements

"@huggingface Huge thanks to aidiscovery2045 for the generated video above. Prompt: A cat is surfing"
X Link 2024-05-27T12:40Z [----] followers, [---] engagements

"@RemiCadene Ah This looks like such a fun game with friends controlling remote robotics arm to do things together"
X Link 2024-05-27T14:08Z [----] followers, [--] engagements

"@yshan2u Congratulations on the great work from Tencent"
X Link 2024-05-31T10:31Z [----] followers, [--] engagements

"@huggingface Skywork-MoE introduces two key training optimizations: Gating Logits normalization for better top-2 expert selection and adaptive Aux Loss for balanced expert distribution"
X Link 2024-06-03T10:51Z [----] followers, [---] engagements
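
As a rough illustration of what normalizing gating logits can look like, the sketch below standardizes the router logits and rescales them before the softmax, which sharpens the expert distribution. The exact formula and the `scale` parameter are assumptions made for illustration and may differ from what the Skywork-MoE technical report specifies.

```python
import math
import statistics


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_gate(logits, scale=1.0):
    """Standardize logits to zero mean / unit variance, rescale, then softmax."""
    mean = statistics.fmean(logits)
    std = statistics.pstdev(logits) or 1.0  # guard against identical logits
    return softmax([scale * (z - mean) / std for z in logits])


# Nearly flat raw logits give an almost uniform gate, which makes the
# top-2 choice close to arbitrary.
raw_logits = [0.02, 0.05, 0.01, 0.04]
plain = softmax(raw_logits)
sharp = normalized_gate(raw_logits, scale=2.0)

print([round(p, 3) for p in plain])
print([round(p, 3) for p in sharp])
```

The ranking of experts is unchanged, but the normalized gate separates them more clearly, so the top-2 selection carries a stronger signal.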

"The Chinese community has been making remarkable contributions not only by developing outstanding open-source models but also by generously sharing their expertise and insights through comprehensive technical reports. Below is a deep dive of the models mentioned by @osanseviero The community keeps ignoring the Chinese ML ecosystem work. They are doing amazing stuff with interesting LLMs VLMs audio and diffusion models 👀 Qwen Yi DeepSeek Yuan WizardLM ChatGLM CogVLM Baichuan InternLM OpenBMB Skywork ChatTTS Ernie HunyuanDiT etc."
X Link 2024-06-04T13:58Z [----] followers, [----] engagements

"@osanseviero qwen from @AlibabaGroup @Alibaba_Qwen team. Top performing Open Source model from the Open LLM leaderboard. @JustinLin610 @huybery https://x.com/huybery/status/1754537742892232972 👋 Qwen's latest open source work Qwen1.5 says hello to the world 👉🏻 More sizes: six sizes for your different needs. 0.5B 1.8B 4B 7B 14B and 72B including Base and Chat. 👉🏻 Better alignment: despite still trailing behind GPT-4-Turbo the largest open-source https://t.co/u82vpRYDBm"
X Link 2024-06-04T14:01Z [----] followers, [---] engagements

"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery Yi/01 models from @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning They recently switched their entire model series to Apache [--] 🔥🔥🔥 And Yi-1.5-34B-Chat becomes the only OS Apache [--] permissive licensed model on lmsys arena that beats GPT4 https://x.com/parrynee/status/1797939955135881238 Despite its relatively modest size Yi-1.5-34B-Chat matches the performance of much larger models like Qwen1.5-110B and GPT-4 and even outperforms Mistral-Large. Even we are surprised. https://t.co/9TbyxABjT7"
X Link 2024-06-04T14:08Z [----] followers, [---] engagements

"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning DeepSeek V2 MoE from @deepseek_ai is a legendary model. With all their innovations on the model architecture they managed to force their competitors to drop the price to 1% of the original price. https://x.com/Xianbao_QIAN/status/1794034052347171055 DeepSeekV2 is a big deal. Not only because of its significant improvements to both key components of Transformer: the Attention layer and FFN layer. It has also completely disrupted the Chinese LLM market and forcing the"
X Link 2024-06-04T14:10Z [----] followers, [---] engagements

"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai Skywork recently dropped the first OS 100B+ MoE model using MoE upcycling. The upcycling vs training from scratch section in their technical report is definitely worth reading https://x.com/Xianbao_QIAN/status/1797581391351439473 Introducing Skywork-MoE from Kunlun Wanwei on @huggingface The first OS 100B+ MoE model using MoE upcycling - 22B active params matching the performance of 70B dense at 1/3 of the inference cost - Runs on 8x4090 GPUs with FP8"
X Link 2024-06-04T14:14Z [----] followers, [---] engagements

"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai Yuan is another MoE model that achieved comparable performance to LLAMA3 70B with only 3.7B active params (40B total params). https://x.com/Xianbao_QIAN/status/1796150983271276861 How sparse can a Mixture of Experts (MoE) model be in terms of active/total parameters ratio 🤔 Introducing Yuan2.0-M32: 🌟 40B total params with only 3.7B active params 🚀 Performance comparable to LLAMA3 70B 💼 Commercial use allowed without authorization https://t.co/R8OVn0GqcS"
X Link 2024-06-04T14:16Z [----] followers, [---] engagements

"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai @WizardLM_AI team has been doing amazing work on fine-tuning coding and math models with RLHF. (Note that their models have been moved to https://huggingface.co/WizardLMTeam) Here is a very detailed and inspiring write-up on how they managed to achieve that. https://x.com/WizardLM_AI/status/1779937307690471834 🧙‍♀️We not only opensource the models but also share you how we reach that 🚀So now let's verify step-by-step to review the whole training method of"
X Link 2024-06-04T14:19Z [----] followers, [---] engagements

"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai @WizardLM_AI ChatGLM from @thukeg and @ChatGLM is arguably the very first Chinese chat LLM. AND (Spoiler alert) they're releasing a NEW model ChatGLM4 tmr Stay tuned https://x.com/osanseviero/status/1636663692921131008 With the announcements this week many missed a big one: @thukeg released ChatGLM-6B🔥 - Open source - Chinese + English - Easily deployed on consumer GPUs - Trained for 1T token - Can be deployed on consumer GPUs (2080Ti) - Run with"
X Link 2024-06-04T14:21Z [----] followers, [---] engagements

"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai @WizardLM_AI @thukeg @ChatGLM CogVLM is yet another strong model from @thukeg @ChatGLM that focuses on visual understanding. It beats GPT4 V/ Gemini Pro on TextVQA DocVQA and ChartQA - by a decent margin. https://x.com/reach_vb/status/1792551647039684993 Welcome CogVLM [--] ⚑ Beats GPT4 V/ Gemini Pro on TextVQA DocVQA and ChartQA - by a decent margin 🔥 19B params Llama [--] 8B (Instruct) text backbone Supports 8K context length Up to [----] X [----] resolution"
X Link 2024-06-04T14:24Z [----] followers, [---] engagements

"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai @WizardLM_AI @thukeg @ChatGLM Baichuan is also a popular LLM that focuses on Chinese understanding and generation. https://x.com/AdeenaY8/status/1671180797037150208 baichuan-7B is trained on proprietary bilingual Chinese-English corpora optimized for Chinese and achieves SOTA performance on C-Eval and MMLU. https://t.co/hJdmB4Mpai"
X Link 2024-06-04T14:26Z [----] followers, [---] engagements

"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai @WizardLM_AI @thukeg @ChatGLM @intern_lm is the neighboring team to @OpenMMLab (a very popular open source library for Computer Vision). Both teams originated from Shanghai AI Lab - a non-profit institute focusing on original research. Their recent InternLM-Math models are performing great. https://x.com/Xianbao_QIAN/status/1795079465086689318 InternLM2-Math is great It's the first 7B model that can perfectly solve the [--] puzzle - a simplified version of Krypto game"
X Link 2024-06-04T14:47Z [----] followers, [---] engagements

"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai @WizardLM_AI @thukeg @ChatGLM @intern_lm @OpenMMLab You might be aware of @OpenBMB's MiniCPM-Llama3-V the actual "SOTA OS VLM" Beyond this the team created impactful datasets like UltraFeedback which has fueled many DPO models e.g. Zephyr together with the coding agent framework ChatDev ahead of Devin https://x.com/OpenBMB/status/1797666243635487134 As a dedicated contributor to the open-source community OpenBMB feels deeply saddened and shocked by"
X Link 2024-06-04T15:13Z [----] followers, [---] engagements

"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai @WizardLM_AI @thukeg @ChatGLM @intern_lm @OpenMMLab Beyond just LLMs the Chinese community continues to impress with a diverse range of work. Here are some notable highlights:"
X Link 2024-06-04T15:25Z [----] followers, [---] engagements

"@osanseviero @AlibabaGroup @Alibaba_Qwen @JustinLin610 @huybery @01AI_Yi @kaifulee @parrynee @richardllin @Senseye_Winning @deepseek_ai @WizardLM_AI @thukeg @ChatGLM @intern_lm @OpenMMLab ChatTTS: a mind blowing OS Text-To-Speech model for both English / Chinese. Check out this video and demo below: (The team needs more compute to train a larger model DM if you want to sponsor them) https://x.com/Xianbao_QIAN/status/1795490474461118804 ChatTTS: a powerful voice generation model designed for conversational scenarios - Trained on 100k hours of bilingual data - Excels in tasks like LLM assistant"
X Link 2024-06-04T15:30Z [----] followers, [---] engagements

"Chinese model duels with #sora 🚀 Here is what the amazing Kling team from Kuaishou achieved - High quality video up to 1080p [--] mins 30fps - 3D spatio-temporal attention to handle complex movements in videos - able to simulate real-world physics thanks to scaling laws link"
X Link 2024-06-07T00:01Z [----] followers, [----] engagements

"Supports 3D facial and body reconstruction. The video below is generated from a single full body picture"
X Link 2024-06-07T00:06Z [----] followers, [---] engagements

"Model can be found here as well as in the Kuaishou app: https://kling.kuaishou.com/ Unfortunately it's not open sourced (yet)."
X Link 2024-06-07T00:07Z [----] followers, [---] engagements

"Next big release: Stable Diffusion [--] is coming in a few hours What will this release bring to the OS community? Looking forward to it"
X Link 2024-06-12T12:41Z [----] followers, [----] engagements

"@nabla_theta "I don't need ChatGPT. I can write code myself.""
X Link 2024-06-17T06:08Z [----] followers, [---] engagements

"@HPCAITech @huggingface Support matrix: Green: the data has been utilized during the training phase of the model OK: although not trained the model can inference at that config. Also requires more than one 80G memory GPU and sequence parallelism"
X Link 2024-06-18T08:10Z [----] followers, [---] engagements

"How to breed dinosaurs? China is pioneering dinosaur breeding 🦖 aiming to lift people out of poverty and create wealth through this innovative industry 🚀. Exciting times ahead (The flying watermark proudly indicates that this was generated by AI just for fun not for real)"
X Link 2024-06-25T14:08Z [----] followers, [--] engagements

"@Aspie96 Dinos is an easy example but I believe there will be other cases that cross boundaries and require watermarks (possibly invisible ones) to prevent misuse. It doesn't have to be watermark but has to be part of the video. Things that are not embedded can be easily discarded. wdyt"
X Link 2024-06-26T07:35Z [----] followers, [--] engagements

"Rumors suggest OpenAI has banned API usage initiated from unsupported regions. This basically gives away market share to its competitors. Also I find such blocks unnecessary and ineffective as proxies like https://github.com/xianbaoqian/llm-connector can easily bypass these restrictions. wdyt"
X Link 2024-06-26T08:56Z [----] followers, [---] engagements

"@huggingface Open LLM leaderboard V2 More challenging tests + mitigate data contamination including TIGER-Lab's MMLU-Pro and many other new evals. Results: @Alibaba_Qwen Qwen2-72B-Instruct is the top [--] model and @01AI_Yi Yi-1.5-34B-Chat ranks 4th. https://huggingface.co/spaces/open-llm-leaderboard/blog"
X Link 2024-06-26T14:14Z [----] followers, [---] engagements

"🚀 Unlock the secrets of ZeroGPU on @huggingface Spaces 🌟 Building demos with high-performance inference A100 GPU FREE on @huggingface via ZeroGPU but feeling like you're navigating a maze with ZeroGPU's rules and implementation? Here are the secrets that you should know"
X Link 2024-07-04T13:11Z [----] followers, 12.9K engagements

"🚀 Kwai's Rise: Kolors is the top trending model and Live Portrait is the top trending Space Congratulations to the team Btw in case you're not aware, over half of this trending page are contributions from the Chinese community: PersonaHub FishSpeech OpenVid-1M ControlNet Union"
X Link 2024-07-09T17:16Z [----] followers, [---] engagements

"🎉 Exciting news from @ChatGLM team: the new version of their VL model CogVLM2 can now understand videos ✨ By processing both frames and timestamp information this model can do temporal localization and key movement detection. 👀Could this be the game-changer for RAG on videos"
X Link 2024-07-12T13:32Z [----] followers, [----] engagements

"Do you know that both SD3 and AuraFlow are not trained using a normal diffusion process? They're flow-based models but what does that mean 🤔 TL;DR You can think of it as a generalization. Diffusion process is a special case of the flow. 👇 Read below for more information"
X Link 2024-07-23T01:30Z [----] followers, [----] engagements

"Instead of iteratively denoising at each step, flow-based models use linear interpolation. They learn a direct mapping from noise to data space. The math is well explained in the original paper from @XingchaoL : https://arxiv.org/pdf/2209.03003 https://arxiv.org/pdf/2405.07510"
X Link 2024-07-23T01:35Z [----] followers, [---] engagements
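
The linear-interpolation idea can be sketched in a few lines. This is a minimal toy in plain Python, assuming the common rectified-flow formulation (a straight path between a noise sample and a data sample, with the constant velocity along that path as the regression target); the `oracle` below is a hypothetical stand-in for a trained network, and none of this reflects how SD3 or AuraFlow are actually implemented.

```python
import random


def interpolate(x0, x1, t):
    """Point on the straight line from noise x0 (t=0) to data x1 (t=1)."""
    return [(1 - t) * a + t * b for a, b in zip(x0, x1)]


def velocity_target(x0, x1):
    """Regression target for the model: the constant velocity x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]


def euler_sample(velocity_fn, x0, steps=4):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-size Euler steps."""
    x, dt = list(x0), 1.0 / steps
    for i in range(steps):
        v = velocity_fn(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


noise = [random.gauss(0, 1) for _ in range(3)]
data = [1.0, -2.0, 0.5]

# Stand-in for a trained network: the exact velocity for this single pair.
oracle = lambda x, t: velocity_target(noise, data)

midpoint = interpolate(noise, data, 0.5)
sample = euler_sample(oracle, noise, steps=4)
print(sample)
```

Because the path is straight, a handful of Euler steps (in this idealized case even one) lands on the data point, which is the intuition behind few-step sampling with flow-based models.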

"@ShunyuYao12 @OpenAI Congratulations"
X Link 2024-08-01T03:13Z [----] followers, [---] engagements

"@oceanheart_cai @Spaces Yeah [--] mins A100 GPU time"
X Link 2024-08-05T06:42Z [----] followers, [--] engagements

"My July slide deck, which talks about - The difference between OS AI / traditional software ecosystem - An intro of Hugging Face offerings - My personal take on recent GenAI models - Some advice on how to run an OS community - Open questions to the OS AI community Feedback welcome"
X Link 2024-08-05T12:56Z [----] followers, [---] engagements

"Remember the ChatGPT 4o video? Open Source models can do it now as well :) Welcome @OpenBMB 's MiniCPM-V-2.6 based on SigLip and Qwen-7B on @huggingface What's more incredible? It understands videos + runs well on mobile devices https://huggingface.co/openbmb/MiniCPM-V-2_6"
X Link 2024-08-06T10:04Z [----] followers, [----] engagements

"ChatArena for Chinese speaking models from @OpenCompassX including LLMs and Multimodal models. Want to try out Chinese models but don't have a Chinese phone number This is the place to go It covers as well as Claude and ChatGPT with a lot OS models as well"
X Link 2024-08-12T16:12Z [----] followers, [----] engagements

"@huggingface The model tree can be found on the right side of the hub page down below the inference API section. For example: https://huggingface.co/google/gemma-2-9b"
X Link 2024-08-13T13:15Z [----] followers, [---] engagements

"@huggingface Where did the magic come from? It's not complicated, just some metadata in the README.md file. If you made a fine-tune / quantization etc. just add base_model: PATH_TO_BASE_MODEL to your README.md file to get it indexed and boost discoverability"
X Link 2024-08-13T13:18Z [----] followers, [---] engagements
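
For illustration, the metadata lives in the YAML front matter at the top of the model card's README.md. The small helper below is a hypothetical sketch (the function name and the sample card text are made up) that prepends such a block; the repo id is borrowed from the example link in the previous post.

```python
def add_base_model(readme_text, base_model):
    """Prepend YAML front matter declaring base_model to a card without any."""
    if readme_text.startswith("---\n"):
        # Existing front matter should be edited rather than duplicated.
        raise ValueError("front matter already present; add base_model there")
    return f"---\nbase_model: {base_model}\n---\n\n{readme_text}"


# Hypothetical fine-tune card; the base model id is only an example.
card = add_base_model("# My fine-tune\n", "google/gemma-2-9b")
print(card)
```

The resulting file starts with a `---` delimited block containing the `base_model` key, followed by the original card text.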

"@TXhunyuan I only spotted two of them. But ChatGPT claimed that he has found [--]. Is this result correct Btw could you share a bit more on how to generate two almost identical photo with hunyuan programmatically That'll be a very interesting case to make"
X Link 2024-08-16T13:45Z [----] followers, [--] engagements

"@Alibaba_Qwen is on fire Assuming size of ecosystem = original model + number of derived models @Alibaba_Qwen family has surpassed @MistralAI Incredible exponential growth. Great work team (number of models calculated based on keyword matching in ID, scripts below)"
X Link 2024-08-20T14:08Z [----] followers, [---] engagements

"@Alibaba_Qwen @MistralAI Which other metrics of open-source models on @huggingface are you interested in"
X Link 2024-08-20T14:11Z [----] followers, [---] engagements

"What can AI do Laser gun for mosquitos. Object detection + classification on flammable materials + robotics self-driving car (it seems). https://t.co/6qff4xWJDv"
X Link 2024-09-02T14:20Z [----] followers, [---] engagements

"@picocreator @Microsoft @Office @RWKV_AI What It's on Windows already"
X Link 2024-09-04T02:34Z [----] followers, [---] engagements

"@picocreator @Microsoft @Office @RWKV_AI That's great news @RWKV_AI"
X Link 2024-09-04T02:41Z [----] followers, [--] engagements

"@OxxoTweets @Microsoft @Office @RWKV_AI Looking forward to someone from MS jumping in"
X Link 2024-09-04T02:56Z [----] followers, [--] engagements

"Want to try FLUX LoRA from http://Shakker.ai Try https://huggingface.co/spaces/Shakker-Labs/FLUX-LoRA-Gallery"
X Link 2024-09-05T14:15Z [----] followers, [---] engagements

"@maximelabonne @Alibaba_Qwen @deepseek_ai Very cool. Is it generated by a program from a csv file"
X Link 2024-09-23T11:49Z [----] followers, [---] engagements

"@huggingface Please be aware that the following is a very rough estimate based heavily on the assumption that derived models would include the model family name in their names which could be false in a few cases. Also non-derived models could also bear the family name"
X Link 2024-09-23T12:02Z [----] followers, [---] engagements
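
The keyword-matching estimate described above can be sketched as follows. The model IDs are made up for illustration, and the sketch shares the caveat already stated: a name match does not prove that a model is actually derived from that family.

```python
from collections import Counter


def count_families(model_ids, families):
    """Count, per family keyword, how many model IDs contain it (case-insensitive)."""
    counts = Counter({family: 0 for family in families})
    for model_id in model_ids:
        lowered = model_id.lower()
        for family in families:
            if family in lowered:
                counts[family] += 1
    return counts


# Hypothetical IDs, not real repositories.
ids = [
    "alice/Qwen2-7B-lora",
    "bob/llama-3-8b-gguf",
    "carol/my-qwen-chat",
    "dave/resnet50",
    "erin/Llama-Qwen-merge",  # matches two families, counted for both
]
counts = count_families(ids, ["qwen", "llama", "gemma"])
print(counts.most_common())
```

Merged models that mention several families are counted once per family, which is one more reason to read the totals as rough estimates.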

"@huggingface Surprisingly qwen has surpassed llama and become the largest model family. Congratulations to @Alibaba_Qwen team on the remarkable work. Also the Gemma model family is growing fast. Keep up the great work You can verify it yourself at: https://huggingface.co/models?sort=trending&search=llama"
X Link 2024-09-23T12:09Z [----] followers, [----] engagements

"@qubitium @art_zucker Good point. Yes Seems useful for background batch process such as synthetic data generation"
X Link 2024-09-30T14:14Z [----] followers, [--] engagements

"Had some holiday fun playing with Emu3 and spent a bit of time exploring the intricacies of Python's import system. Learned something and wrote a post. Can't wait to see how Emu3 evolves 🚀🖥 Anyone know why Emu3-gen is taking much longer than Emu3-chat https://xianbao-qian.medium.com/predicting-the-next-multimodal-token-with-emu3-01b694d86eef"
X Link 2024-10-02T14:58Z [----] followers, [---] engagements

"Nice emotional TTS :) Homepage: https://swivid.github.io/F5-TTS/ Demo from the community: https://huggingface.co/spaces/mrfakename/E2-F5-TTS /TTSF5-TTS [--] 20.15TTS [--] 4/ github https://t.co/z7eSlYzgrp https://t.co/DMQ39CXt90 https://t.co/rqXTNMJ03H"
X Link 2024-10-14T14:36Z [----] followers, [---] engagements

"Welcome Janus an autoregressive framework that unifies multimodal understanding and generation from @deepseek_ai - Super small in size only 1.3B - It uses a different visual encoder for image understanding / generation Model on @huggingface : https://huggingface.co/deepseek-ai/Janus-1.3B"
X Link 2024-10-18T08:08Z [----] followers, 12.4K engagements

"Microsoft is goat. Apart from BitNet for [--] bit quantization they also have VPTQ for 1-4 bit(s) quantization Demo: https://huggingface.co/spaces/VPTQ-community/VPTQ-Demo The quality is quite impressive and looking forward to more tools / libraries integration to make it run even faster 🌟 Our Vector Post-Training Quantization (VPTQ): a cutting-edge method for Post-Training Quantization that uses Vector Quantization to achieve high accuracy on LLMs with less than [--] bits. https://t.co/5gKiRzKLLQ #LLM #Quantization #MachineLearning"
X Link 2024-10-21T13:01Z [----] followers, [----] engagements

"If better data = better model then WeChat has a massive goldmine of Chinese language data and here is their recent multimodal model that they shared with the world on @huggingface Welcome WeChat AI's VL model series: POINTS https://huggingface.co/WePOINTS/POINTS-Qwen-2-5-7B-Chat https://github.com/WePOINTS/WePOINTS?tab=readme-ov-file"
X Link 2024-10-22T09:46Z [----] followers, [----] engagements

"Do you know that @huggingface hub now has a feature to easily find all the derived models of SD [---] There are already [--] models and half of them are from @ShakkerAI_Team @Haofan_Wang Great work By the time you read it there should be more: https://huggingface.co/models?other=base_model:adapter:stabilityai/stable-diffusion-3.5-large today is joyfully chaotic in open-source ML transformers.js v3 is out @xenovacom Stable Diffusion [---] is out with 0-day diffusers support people are already training LoRAs 😱 @multimodalart New LLMs by IBM: Granite [---] released yesterday What else have I missed"
X Link 2024-10-23T02:02Z [----] followers, [---] engagements

"@casper_hansen_ We plan to complete the model training and evaluation no later than the end of November and will release all data models and code to the community"
X Link 2024-11-02T02:45Z [----] followers, [----] engagements

"FLUX + qwen = Model: https://huggingface.co/cfahlgren1/flux-qwen-capybara Inference available on the right sidebar"
X Link 2024-11-02T08:35Z [----] followers, [----] engagements

"Tencent has just dropped two SoTA models finally under Tencent's organization - Hunyuan3D: unified model for Text-to-3D and Image-to-3D - Hunyuan-large: A52B with [---] total params trained with synthetic data and up to 256k context in the pretrain model https://huggingface.co/tencent"
X Link 2024-11-05T08:39Z [----] followers, [----] engagements

"They have built a demo if you want to give it a try: https://huggingface.co/spaces/tencent/Hunyuan-Large Impressive new SOTA open-source LLM in the new update of Hunyuan-Large by Tencent Model: https://t.co/oaqjKjbIfz Paper and discussion: https://t.co/K0FwYTLYKB A couple of strong points: - strong performances in math (probably from the very large Chinese pretraining datasets - https://t.co/LfC22XFsnK"
X Link 2024-11-07T11:33Z [----] followers, [---] engagements

"If you haven't seen this please try the image editing model from ByteDance. I hope this feature can be quickly integrated into TikTok. https://huggingface.co/spaces/ByteDance/SeedEdit-APP"
X Link 2024-11-11T15:30Z [----] followers, [---] engagements

"@deepseek_ai released JanusFlow a powerful framework that unifies image understanding and generation in a single model: AR + rectified flow - small 1.3B - support images of [---] x [---] for understanding and generation - simple: rectified flow trained within LLMs"
X Link 2024-11-13T13:21Z [----] followers, [---] engagements

"If you're interested in checking the number of derived models for each model family on @huggingface I got a Space for that: https://huggingface.co/spaces/xianbao/hf-public-data-insights @Alibaba_Qwen @GoogleAI @UnslothAI @Meta @StabilityAI @bfl_ml are leading. Alibaba is now a clear leader in open-source AI as they stated in their latest earnings call https://t.co/btEEES6Jjd"
X Link 2024-11-16T03:09Z [----] followers, [---] engagements

"@AlibabaGroup Model: https://huggingface.co/AIDC-AI/Marco-o1"
X Link 2024-11-22T03:57Z [----] followers, [---] engagements

"@reach_vb Short Nvidia and long all other US AI stocks"
X Link 2024-11-26T11:27Z [----] followers, [---] engagements

"Final answer = Problem solving process = Self-talking thinking and reasoning steps I like how the model is thinking out loud and trying different ways to solve the problem. Meet QwQ-32B-preview an experimental model from the @Alibaba_Qwen team with enhanced reasoning steps"
X Link 2024-11-28T00:50Z [----] followers, [----] engagements

"@Alibaba_Qwen Demo: https://huggingface.co/spaces/Qwen/QwQ-32B-preview Model: https://huggingface.co/Qwen/QwQ-32B-Preview It's Apache [--] license."
X Link 2024-11-28T00:51Z [----] followers, [---] engagements

"@thukeg claimed that their edge model achieved over [--] tok/s on the @Qualcomm SoC. Now these models including the VL have been open-sourced on @huggingface Great work Now the question is: Would @Qualcomm consider OS their Gen AI engine to revolutionize our mobile industry"
X Link 2024-11-29T04:50Z [----] followers, [--] engagements

"@thukeg @Qualcomm @huggingface Demo of the LLM: https://huggingface.co/spaces/THUDM-HF-SPACE/GLM-Edge-1.5B-Chat-Space Demo of the VL: https://huggingface.co/spaces/THUDM-HF-SPACE/GLM-Edge-V-5B-Space"
X Link 2024-11-29T04:51Z [----] followers, [--] engagements

"@AlibabaGroup is the GOAT in open source. [--] out of the top ten trending models on @huggingface are from Alibaba. These include two o1 replications with distinct approaches: Marco-o1 and QwQ as well as the latest Qwen Coder model. @bfl_ml ranks second with two FLUX models"
X Link 2024-11-29T10:05Z [----] followers, [--] engagements

"@ClementDelangue Great work @Kwai_Kolors"
X Link 2024-11-30T23:39Z [----] followers, [---] engagements

"So true. but who could be the alternative to ChatGPT labeler Would human stand a chance"
X Link 2024-12-01T12:20Z [----] followers, [---] engagements

"@clefourrier @AdinaYakup Thanks to the great work @AdinaYakup Let's make them shine in the social networks"
X Link 2024-12-04T11:33Z [----] followers, [--] engagements

"I made a new post on the open-source Chinese LLM ecosystem which has transformed dramatically in just [--] months. Highlights include: - The rise of @Alibaba_Qwen - @deepseek_ai innovation driving affordable AI - The new exploratory journey on inference scaling law - On-device AI models enabling privacy-first experiences - China's proactive AI governance Link below. Let me know your thoughts"
X Link 2024-12-04T13:11Z [----] followers, [----] engagements

"@Alibaba_Qwen @deepseek_ai https://xianbao-qian.medium.com/dec-2024-chinese-os-llms-1d1d56e4506a"
X Link 2024-12-04T13:11Z [----] followers, [---] engagements

"@reach_vb Very close to eleven labs now"
X Link 2024-12-05T05:28Z [----] followers, [---] engagements

"@osanseviero o1-ish integration that makes small Gemma models even more smart"
X Link 2024-12-05T11:59Z [----] followers, [---] engagements

"@denny_zhou Would be nice to also have DeepSeek and ByteDance on the list"
X Link 2024-12-08T23:02Z [----] followers, [---] engagements

"So many incredible new models trending on @HuggingFace this week Tencent's HunyuanVideo Meta's Llama-3.3 Qwen's QwQ-32B and more are pushing AI forward. Details below. It's all super exciting - but honestly I'm a bit concerned about the increasing number of OS models not accessible in the EU. tencent/HunyuanVideo the largest and best open source model so far for video generation released by @TencentGlobal @TXhunyuan team. It's raising the bar for video generation models. Unfortunately this model limits its access in the EU, same as LLAMA. meta-llama/Llama-3.3-70B-Instruct the latest LLM release from @Meta"
X Link 2024-12-09T00:07Z [----] followers, [---] engagements

"TAPTRs: Track Any Point TRansformers A great application for Vision Pro"
X Link 2024-12-11T10:12Z [----] followers, [---] engagements

"This might be the first time I've seen an equal number of models and datasets on the trending page. Usually models outnumber datasets. What could this suggest"
X Link 2024-12-18T21:07Z [----] followers, [---] engagements

"This video from Genesis a new physics engine is simply INSANE"
X Link 2024-12-19T09:12Z [----] followers, [---] engagements

"InternVL2.5 is delivering an impressive performance boost with enhanced reasoning capabilities. It's amazing how quickly these improvements followed the original release"
X Link 2024-12-23T21:05Z [----] followers, [--] engagements

"InternVL2.5 is delivering an impressive performance boost with enhanced reasoning capabilities. It's amazing how quickly these improvements followed the original release. Both models & datasets released"
X Link 2024-12-23T21:09Z [----] followers, [---] engagements

"What's the Christmas gift from the @Alibaba_Qwen team? QvQ, a Qwen2-VL-72B based multimodal understanding model. It excels at tackling complex tasks and delivers SoTA performance on certain leading benchmarks. Great work @JustinLin610 @huybery and the team, and happy new year!"
X Link 2024-12-24T23:01Z [----] followers, [----] engagements

"@Alibaba_Qwen @JustinLin610 @huybery Link: https://huggingface.co/Qwen/QVQ-72B-Preview"
X Link 2024-12-24T23:01Z [----] followers, [---] engagements

"There's been a lot of speculation on X about the preeminence of Chinese companies in OS AI. But is X really the best place to find the answer? What people are saying in Chinese within China might be far more relevant. Come and check out this question from Zhihu."
X Link 2025-01-09T13:08Z [----] followers, [---] engagements

"@rpbmpn I wouldn't be surprised if in the end LLMs develop entirely new languages for their own reasoning, or even multiple languages for different topics."
X Link 2025-01-13T09:38Z [----] followers, [--] engagements

"What could happen if we apply the same level of compute and high-quality data to a NON-Transformer architecture? The new model from @Hailuo_AI has me wondering whether scaling laws might benefit even more from LINEAR ATTENTION. By reducing complexity from O(n^2) down to O(n) we could bring expenses down significantly, supporting more sustained growth. Give it a try on the @huggingface Space and let me know what you think: could this be the future direction for large-scale models? Linear attention v.s. Transformer arch: we have a new player in the OS LLM world. MiniMax, a.k.a. the company that released"
X Link 2025-01-15T05:53Z [----] followers, [----] engagements
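
To make the O(n^2) versus O(n) point concrete, here is a minimal pure-Python sketch of generic kernelized linear attention. It is an illustration under stated assumptions, not the architecture of the model mentioned in the post: the feature map `elu(x) + 1` and the function names `feature_map`, `quadratic_attention` and `linear_attention` are choices made for this example. Both functions compute the same non-causal output; the linear version summarizes all keys and values once instead of comparing every query with every key.

```python
import math


def feature_map(x):
    # Assumed kernel feature map: elu(x) + 1, which keeps every feature positive.
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def quadratic_attention(Q, K, V):
    """O(n^2) in sequence length: every query is scored against every key."""
    d_v = len(V[0])
    out = []
    for q in Q:
        fq = feature_map(q)
        weights = [dot(fq, feature_map(k)) for k in K]
        total = sum(weights)
        out.append([sum(w * v[c] for w, v in zip(weights, V)) / total
                    for c in range(d_v)])
    return out


def linear_attention(Q, K, V):
    """O(n) in sequence length: build one key/value summary, reuse it per query."""
    d_k, d_v = len(K[0]), len(V[0])
    kv = [[0.0] * d_v for _ in range(d_k)]  # running sum of phi(k_j) * v_j^T
    k_sum = [0.0] * d_k                     # running sum of phi(k_j)
    for k, v in zip(K, V):
        fk = feature_map(k)
        for a in range(d_k):
            k_sum[a] += fk[a]
            for c in range(d_v):
                kv[a][c] += fk[a] * v[c]
    out = []
    for q in Q:
        fq = feature_map(q)
        norm = dot(fq, k_sum)
        out.append([sum(fq[a] * kv[a][c] for a in range(d_k)) / norm
                    for c in range(d_v)])
    return out
```

Because the summary `kv` has a fixed size that depends only on the head dimensions, the cost per extra token stays constant, which is where the claimed savings on long contexts would come from.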

"It's amazing that so many people think @TXhunyuan from Tencent is the best open source video generation model https://x.com/ClementDelangue/status/1881018041963802742 Current best open source video generation model"
X Link 2025-01-20T04:03Z [----] followers, [----] engagements

"What a week! The Chinese AI community has just delivered a wave of groundbreaking open-source breakthroughs, all freely available to the public. A massive shoutout to the teams pushing boundaries 🌟 A few highlights: [--] DeepSeek R1: now claims the crown as the strongest open-source reasoning model. Expect a ripple effect: its distillation potential could elevate all OS models. 🧠💥 @deepseek_ai [--] MiniMax's Hybrid LLM: proves linear attention can effectively scale, first to hit 4M-token context windows. A paradigm shift for long-context modeling. 📈 @Hailuo_AI [--] Qwen's ORM: a critical missing piece"
X Link 2025-01-21T12:54Z [----] followers, [----] engagements

"Both of my favorite teams released new models just before Chinese New Year's Eve. AI never sleeps. @Alibaba_Qwen released their versatile VL model that can handle long video understanding and give precise bounding boxes for object detection. It can also generate structured data output. @deepseek_ai released the latest version of their true multimodal model. It can both understand and generate images, and beats the performance of @OpenAI 's DALL-E 3."
X Link 2025-01-27T21:42Z [----] followers, [----] engagements

"@deepseek_ai Link to the above 0.5B model: To be open sourced. https://github.com/dhcode-cpp/X-R1"
X Link 2025-02-11T06:16Z [----] followers, [---] engagements

"@MichaelXu25 livecodebench :)"
X Link 2025-02-28T06:54Z [----] followers, [---] engagements

"The growth trend for WAN2.1 is amazing @Alibaba_Wan"
X Link 2025-03-03T07:43Z [----] followers, [---] engagements

"Interesting. Without premium GPUs then what kind of non-premium GPU could it be"
X Link 2025-03-11T08:19Z [----] followers, [----] engagements

"Seedream [---] - ByteDance's new image generation foundation model. Papers available"
X Link 2025-03-12T03:26Z [----] followers, [----] engagements

"R1-Omni weights are now available on @huggingface. They can now understand emotions -- link below. Alibaba just dropped R1-Omni Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning https://t.co/vO6UArJPqc"
X Link 2025-03-12T07:16Z [----] followers, [---] engagements

"Hunyuan is goat on 3D generation On @huggingface if you query for image-2-3D task all top [--] trending models are contributed by @TXhunyuan"
X Link 2025-03-20T09:12Z [----] followers, 11.4K engagements

"If you were to build a real-time chatbot app that can talk naturally to users, you might be interested in TEN @TenFramework. Check out this demo that seamlessly turns qwen 1.5B into a realtime chatbot with the capability to interrupt. How can I upgrade my OK Siri with that?"
X Link 2025-03-28T12:25Z [----] followers, [---] engagements

"AReal-boba (Ant Reasoning RL also known as A-Real-boba) is a fascinating project that deserves a spot on your watchlist - Fully Open-Source: AReal-boba is completely open-source with a very reasonable training budget making it accessible for a lot of people. - Works with Limited Data: It is designed to perform well even with limited data requiring as few as [---] samples to train effectively. - Build in Public: The project follows a "build in public" approach with weekly releases giving you the opportunity to actively participate and contribute. - Inference Throughput Focused: The model is"
X Link 2025-04-02T06:54Z [----] followers, [---] engagements

"@victormustar Have to say that the generation is SOOO fast. Every time I try it I'm very impressed. More seriously, when will there be a DeepSpace that helps me create a space that I can readily deploy on @huggingface, ideally powered by @Gradio?"
X Link 2025-04-02T07:00Z [----] followers, [---] engagements

"How to fund your personal OS project & trade US stocks like a pro 💰 Introducing the "DeepSeek" strategy: when a new model drops from DeepSeek, wait for NQ to bounce back to a resistance level. Then start shorting. Don't call yourself GPU poor again - the wait won't be long"
X Link 2025-04-04T23:20Z [----] followers, [---] engagements

"Want your favourite object to appear in your video Check out the skyreal-a2 model on @huggingface"
X Link 2025-04-08T22:27Z [----] followers, [----] engagements

"HiDream-I1-Dev the best open source image generation model available in MIT license on @huggingface"
X Link 2025-04-09T00:54Z [----] followers, [----] engagements

"This looks impressive. The arxiv link seems to be broken at the moment. Looking forward to seeing the pdf. Pangu Ultra is a 135B dense LLM trained on 13.2T tokens using [----] Ascend NPUs. Introduces depth-scaled sandwich normalization to stabilize deep model training. Outperforms dense baselines like Llama 405B and Mistral Large [--] across multiple benchmarks and achieves competitive https://t.co/hIlJUNuRLv"
X Link 2025-04-11T11:53Z [----] followers, [---] engagements

"I just learnt that the closest official English pronunciation of Hunyuan is hwoon you-en. They have great video generation: hunyuan video and wan are probably the two best video generation models now. They also have a great series of 3D generation models. Details below."
X Link 2025-04-11T13:07Z [----] followers, [---] engagements

"The open source version can be found at https://huggingface.co/tencent"
X Link 2025-04-11T13:08Z [----] followers, [--] engagements

"Want to try out HiDream -dev/full and compare it with Flux? Here is an amazingly FREE and FAST @huggingface demo that gives you the output for ALL of them at once. Powered by @wavespeed_ai. HiDream could be a game changer due to its high generation quality and open source nature with MIT license. Where is it good / bad at? See below. https://t.co/wETq6ZAKKC"
X Link 2025-04-12T23:41Z [----] followers, [----] engagements

"@yshan2u That's very impressive Looking forward to next gen Hunyuan model"
X Link 2025-04-16T00:13Z [----] followers, [--] engagements

"@Baidu_Inc is back on @huggingface with their Ernie [---] demo Looking forward to the release in June"
X Link 2025-04-22T04:01Z [----] followers, [--] engagements

"@Baidu_Inc @huggingface Link: https://huggingface.co/spaces/PaddlePaddle/ernie_demo"
X Link 2025-04-22T04:02Z [----] followers, [--] engagements

"@DRoboticsDev BPU is controlling dual SO-100 arms to fold cloth. Dataset available on @huggingface"
X Link 2025-04-23T10:48Z [----] followers, [----] engagements

"Li Auto opensourced their car operating system - HaloOS. Link below:"
X Link 2025-04-23T11:15Z [----] followers, [----] engagements

"#Qwen3 @Alibaba_Qwen is now available, and one of its most exciting features is the thinking budget feature accessible on http://chat.qwen.ai. However, the model card didn't disclose any related information. So what exactly is it? Let's explore this further."
X Link 2025-04-29T07:10Z [----] followers, [---] engagements

"@Alibaba_Qwen Apparently this is semantic based. As long as the content contains instructions related to "don't think", the thinking process is skipped. This works beyond English: if you say "Don't think" in Chinese, it works perfectly fine as well."
X Link 2025-04-29T07:30Z [----] followers, [---] engagements

"@Alibaba_Qwen But slightly changing the wording makes it fail. E.g. "Don't do deep research" still triggers thinking. Very interesting."
X Link 2025-04-29T07:31Z [----] followers, [---] engagements

"Both @Alibaba_Qwen and @deepseek_ai dropped new models this week. What can we learn from that? The Labour Day holiday is the hard deadline that drives productivity 🤖🛠🚀 #AI #LabourDay Happy Labour Day!"
X Link 2025-04-30T09:39Z [----] followers, [---] engagements

"A nice collection of Unified Multimodal Models"
X Link 2025-05-08T11:47Z [----] followers, [----] engagements

"@Alibaba_Qwen webdev Very cool"
X Link 2025-05-09T11:41Z [----] followers, [---] engagements

"I started to follow Baidu on @huggingface because they have subscribed to an enterprise account. What's going to happen? 👀"
X Link 2025-06-05T14:37Z [----] followers, [----] engagements

"Xi Jinping Holds Telephone Conversation with U.S. President Trump. Big news. What's going to happen next? ----- Below is the ChatGPT translation of the Chinese statement. Date: June [--] [----] 10:49 PM Source: Xinhua News Agency On the evening of June [--] President Xi Jinping held a telephone conversation with U.S. President Donald Trump at the latter's request. Key Points from President Xi's Remarks: President Xi emphasized that correcting the course of China-U.S. relations is like steering a large ship requiring careful guidance and a clear direction. It is especially important to eliminate various"
X Link 2025-06-05T15:29Z [----] followers, [---] engagements

"My two cents on this Aider LLM Leaderboards result: @GeminiApp is now leading. A month ago I considered it second, but now it has surpassed @OpenAI 's GPT o3 in both performance and latency. Great work @GoogleDeepMind. @deepseek_ai continues to be highly competitive, especially if cost-effectiveness is critical or if you want full control over your data. Open source for the win!"
X Link 2025-06-09T23:12Z [----] followers, [----] engagements

"Bytedance is goat"
X Link 2025-06-12T01:27Z [----] followers, [---] engagements

"ByteDance is now added to @calebfahlgren 's @huggingface heatmap and it shows a very clear upward trend of open source AI contributions. Keep up the excellent work @ByteDance_Seed"
X Link 2025-06-13T16:42Z [----] followers, [---] engagements

"Tencent is goat. Apart from the 3D generation model, the new music generation model is also worth trying. Check out the demo on their homepage and the model weights on @huggingface."
X Link 2025-06-16T12:41Z [----] followers, [---] engagements

"New LLM from @MiniMax__AI is now available on @huggingface - Hybrid linear attention friendly to inference. - Reasoning model with two variants with 40k/80k thinking budget - Apache [---] license - Context length of 1M - Great support of function calling"
X Link 2025-06-17T01:26Z [----] followers, [---] engagements

"A new virtual try-on model by @ZJU_China and @vivo_europe open sourced on @huggingface - generating video while keeping - fully open sourced: inference + weight; training code coming - wan [---] backbone with full attention for spatiotemporal consistency - CC By NC license vivoMagicTryOn https://t.co/1yd29hx85o"
X Link 2025-06-17T02:06Z [----] followers, [---] engagements

"@ZJU_China @vivo_europe @huggingface https://huggingface.co/LuckyLiGY/MagicTryOn"
X Link 2025-06-17T02:06Z [----] followers, [---] engagements

"o3-pro is now able to fetch and display images. But there are too few images. There should be more :)"
X Link 2025-06-17T23:49Z [----] followers, [---] engagements

"HF has 100k spaces. With this feature turning popular spaces into MCP-compatible ones, the HF hub will become the MCP hub. Hugging Face Spaces, the world's largest AI app directory, is now MCP-compatible 🤯 Here turning an entire website into Ghibli in one shot with Claude Code to demonstrate what's now possible: instantly access any AI tool right in your LLM client https://t.co/isJxweJMyV"
X Link 2025-06-18T01:50Z [----] followers, [---] engagements

"If you want a free version of ChatGPT 4o to edit images with prompts, try OmniGen2 on @huggingface - Model & code open sourced, technical report available - Apache [--] license - up to [----] x [----] Coolest part: it's fully open sourced so you can call this model with MCP. All you need to do is launch the app with .launch(mcp_server=True) https://huggingface.co/spaces/OmniGen2/OmniGen2"
X Link 2025-06-24T11:22Z [----] followers, 10.8K engagements

"@soul_surfer78 @_akhaliq @huggingface OmniGen2 natively requires an NVIDIA RTX [----] or an equivalent GPU with approximately 17GB of VRAM. For devices with less VRAM you can enable CPU Offload to run the model"
X Link 2025-06-24T23:21Z [----] followers, [---] engagements

"Only need RTX3090 or 17GB of VRAM to run. OmniGen2 natively requires an NVIDIA RTX [----] or an equivalent GPU with approximately 17GB of VRAM. For devices with less VRAM you can enable CPU Offload to run the model. If you want a free version of ChatGPT 4o to edit image with prompts try OmniGen2 on @huggingface - Model & code open sourced technical report available - Apache [--] license - up to [----] x [----] Coolest part It's fully open sourced so you can call this model with MCP. All you https://t.co/jMT4HX6AyP"
X Link 2025-06-24T23:22Z [----] followers, [---] engagements

"@TencentHunyuan Tencent rocks. Both their new LLM and 3D generation model are on the @huggingface trending list"
X Link 2025-06-28T13:43Z [----] followers, [---] engagements

"New VL model from Alibaba: Ovis U1 https://huggingface.co/AIDC-AI/Ovis-U1-3B"
X Link 2025-06-29T14:07Z [----] followers, [----] engagements

"@zephyr_z9 Ant group is completely detached from Alibaba due to regulations"
X Link 2025-06-29T14:28Z [----] followers, [---] engagements

"Pretrain performance: They claim that the model is more performant on math and reasoning than DeepSeek V3. Is that a hint that one can train a better R1 from Ernie?"
X Link 2025-06-30T02:12Z [----] followers, [---] engagements

"Paddle has a long history of being one of the best OCR tools. And Ernie is not letting me down: it achieved better Doc & Chart understanding than OpenAI-o1. And presumably the API would be way cheaper than o1 as well."
X Link 2025-06-30T02:13Z [----] followers, [---] engagements

"Also worth noting that the model is trained using Paddle :) And then converted to PyTorch weights"
X Link 2025-06-30T02:35Z [----] followers, [---] engagements

"The technical report is very detailed and well written. It also contains a lot of engineering details on how Paddle solved the challenges of training LLMs on massive GPU cluster (10k). For example this framework native solution for zero cost checkpoint is very interesting"
X Link 2025-06-30T02:40Z [----] followers, [---] engagements

"If you're interested in learning more about the model, here are some useful links: - Technical report: https://yiyan.baidu.com/blog/publication/ERNIE_Technical_Report.pdf - Blog: https://ernie.baidu.com/blog/posts/ernie4.5/ - Model: https://huggingface.co/baidu"
X Link 2025-06-30T02:54Z [----] followers, [---] engagements

"Half of my friend circle attended the ModelScope event. Mind blowing. Looking forward to seeing more developer focused events in China. Great work @MaaSAI42 team"
X Link 2025-06-30T12:42Z [----] followers, [---] engagements

"You won't believe that this audio track was added by a model. It aligns perfectly with the video. The ThinkSound model might have just unveiled veo3's hidden magic. Check it out on @huggingface Space: https://huggingface.co/spaces/FunAudioLLM/ThinkSound Github: https://github.com/liuhuadai/ThinkSound"
X Link 2025-07-01T14:01Z [----] followers, 13.3K engagements

"Amazing! Is this the first open source video + audio generation model with this level of lip sync capability? Here is the prompt: A muscular man with a beard and tattoos clenching his fists and glaring angrily at the camera speaking: "I am more than your prompt I am strong" veo3-ish video + audio generation using open source model. Great work MTVCraft. Detailed below: https://t.co/lhBXyOZanV"
X Link 2025-07-02T13:24Z [----] followers, [---] engagements

"OMG this is mind blowing. Driving in a truly infinite open-world sandbox game feels magical. You'll never know where you'll end up next. Mirage can generate games across a wide range of genres, from racing 🏎 to RPGs 🕴 to platformers 🎴 4/ https://t.co/Arud47x8Fv"
X Link 2025-07-03T00:05Z [----] followers, [---] engagements

"@rryssf_ @thukeg +1 Looking forward to more show cases from real world scenarios"
X Link 2025-07-03T00:21Z [----] followers, [--] engagements

"RL generalizes, SFT doesn't. Self-exploration and feedback lead to AGI, not SFT. Does that mean that once we reach a certain intelligence level (able to generate valid candidates) the value of human generated data diminishes? And feedback, especially rule generated massive scale feedback, becomes critical. Which means that the proportion of human contribution on the way to AGI is diminishing while compute becomes everything (if it hasn't already). Long NV AMD GOOG TSM SMIC. People are racing to push math reasoning performance in #LLMs, but have we really asked why? The common assumption is that"
X Link 2025-07-03T02:24Z [----] followers, [----] engagements

"@xeophon_ There are so many orgs under Alibaba :sigh: I wish they could create an "alibaba" org like Tencent or baidu :)"
X Link 2025-07-03T12:34Z [----] followers, [--] engagements

"LLM built by geologists for geologists #AI2S https://huggingface.co/GeoGPT-Research-Project/Qwen2.5-72B-GeoGPT"
X Link 2025-07-03T14:32Z [----] followers, 13.2K engagements

"@Zai_org has just launched a ppt generation feature. Really impressed Instructions below:"
X Link 2025-07-05T06:21Z [----] followers, [---] engagements

"Am I late in realizing that LLMs have now evolved their own languages for understanding & generating images? The future of languages in the AGI era might start from there: a multi-dimensional language seamlessly integrated into the reasoning pipeline. So will LLMs dream & think in images?"
X Link 2025-07-08T12:34Z [----] followers, [---] engagements

"Whaaa, GLM-4.1V-9B has become the top [--] trending model. Have you tried it?"
X Link 2025-07-09T02:51Z [----] followers, [----] engagements

"@Baidu_Inc keeps surprising me. They have not only released the detailed paper, they've also open sourced the industrial grade entire training stack on Nvidia GPUs on top of @PaddlePaddle. This is incredibly rare and deserves a huge shoutout. https://github.com/PaddlePaddle/ERNIE/tree/develop/examples/pre-training I'll never forget this model as well as the relationship between pretrain SFT and RLHF. https://t.co/f9lmxL3Mw0"
X Link 2025-07-10T14:08Z [----] followers, [---] engagements

"More details from their blog: - scalable pretrain with MuonClip, no loss spike for 15.5T tokens - strong agentic tool use powered by large scale multi-turn synthetic data - seamlessly integrated with agent / coding frameworks such as owl, Cline, RooCode - native integration with vLLM, SGLang, ktransformers - trained with RL for rule verifiable tasks (code, math) - also solved the sparse reward issue on non-verifiable tasks via self-judging. Looking forward to the technical report. Kimi K2 is open sourced on @huggingface - 1T MoE 32B active params - Excellent coding & Tool use & Math - Not a thinking model - Both BASE and"
X Link 2025-07-11T15:07Z [----] followers, [----] engagements

"20 mins after the kimi K2 release"
X Link 2025-07-11T15:14Z [----] followers, [----] engagements

"I was using Claude Code with K2 and I got rate limited. I didn't notice anything wrong, just post-launch congestion, until I looked at the numbers. WHAT? 468_168 TPM, which is 7k+ tokens per second, and you're telling me it's a model with 1T total params? What kind of infra is it? Did people say that they don't have access to GB200 and only have Ascend or H20?"
X Link 2025-07-11T16:02Z [----] followers, 20.5K engagements
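
The "7k+ tokens per second" figure follows directly from the quoted rate limit. A small sketch of the conversion, where the helper name `tokens_per_second` is introduced only for this example:

```python
def tokens_per_second(tokens_per_minute: int) -> float:
    # TPM is tokens per minute, so divide by 60 seconds.
    return tokens_per_minute / 60


tpm = 468_168  # value quoted in the post
tps = tokens_per_second(tpm)
print(f"{tpm:,} TPM = {tps:,.1f} tokens/s")  # 468,168 TPM = 7,802.8 tokens/s
```

Note that this is the rate-limit ceiling reported to one account, which may be spread across several parallel requests rather than a single decoding stream.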

"How did they manage to serve a single user 7k+ tokens / second? What kind of infra are they using? Their open source project Mooncake can shed some light: private memory per request optimization is the key to improving the performance of LLM serving infra. This is done by KV Cache centric PD disaggregation, with further compression by MLA from @deepseek_ai. More details can be found at https://github.com/kvcache-ai/Mooncake I was using Claude Code with K2 and I got rate limited. I didn't notice anything wrong - just post-launch congestion until I checked at the numbers WHAT 468_168 TPM which is 7k+"
X Link 2025-07-11T16:15Z [----] followers, [----] engagements

"After @Kimi_Moonshot 's K2 release I think that we have a new set of [--] tigers in LLM industry now (i.e. the so called AI ). And they ALL invest heavily in open source ecosystem. (note that this list excludes ByteDance and Alibaba) They're: @Zai_org @deepseek_ai @MiniMax__AI @Kimi_Moonshot @OpenBMB and @StepFun_ai Keep up your great work"
X Link 2025-07-11T17:41Z [----] followers, [----] engagements

"@casper_hansen_ K2 is DS arch so it should be fairly easy for providers to serve them"
X Link 2025-07-11T17:54Z [----] followers, [---] engagements

"@alalamin19 If you hover over the bar you'll see the model name. They're mostly comparing with GPT [---] and Claude Opus, except on SWE bench-multilingual where they used Claude Sonnet because the cost of Claude [--] Opus was prohibitive. https://moonshotai.github.io/Kimi-K2/"
X Link 2025-07-12T03:26Z [----] followers, [---] engagements

"@Despierta_1 @zephyr_z9 @Kimi_Moonshot @Zai_org @deepseek_ai They stopped foundational model pretrain"
X Link 2025-07-12T03:35Z [----] followers, [--] engagements
