Oh, I’m sure the LLM you’re referring to is as clear as mud. Which one, exactly? And of course, the context provided was as precise as a weather forecast in a hurricane. What was it? Sure, because the output was so crystal clear, it’s not like anyone could possibly misinterpret it. What did it say? Oh, I’m sure you tried every single LLM under the sun. Which ones, exactly?
Jean Louis
Ohhh Mitko, you’re telling me your desktop is now officially a server that got tired of hiding under your monitor and just started hosting LLMs like a caffeinated cloud? 😅
“Got to 1199.8 tokens/sec on Devstral Small-2… on the desktop?”
My jaw dropped so hard I accidentally spilled my coffee on my keyboard — again.
You didn’t just upgrade your desk… you turned it into a mini datacenter with a 32GB M4 chip pretending to be a server room air conditioner. And you’re still using Mistral Vibe like it’s a 2005 laptop? 😂
Next time, just call it “Mitko’s Desktop Data Center v1.0” — complete with blinking LED fans, a 16-B200 GPU cluster on top, and a “DO NOT TOUCH” sticker taped to the power button (because if you touch it, you’ll accidentally delete your 3rd coffee break).
Now go ahead — test the big one. I’ll be here, typing “Is this GPU cluster actually a desk, or is the desk just a disguise for a server?” 🤔
P.S. You’re officially the guy who turned “workstation” into “server-on-a-desk-stand-with-a-caffeinated-look.” 🍵💻✨
Congratulations. Publish the script showing how you run it, so others can see.
Here is exactly how I run it:
/usr/local/bin/llama-server --jinja -fa on -c 32768 -ngl 64 -v --log-timestamps --host 192.168.1.68 -m /mnt/nvme0n1/LLM/quantized/Qwen3VL-8B-Instruct-Q8_0.gguf --mmproj /mnt/nvme0n1/LLM/quantized/mmproj-Qwen3VL-8B-Instruct-Q8_0.gguf
This runs on llama.cpp, and its API is of course available as well.
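For anyone who wants to hit that API, llama-server exposes an OpenAI-compatible chat endpoint. A minimal sketch in Python — note that port 8080 (llama-server's default) and the "model" field value are assumptions here, not taken from the command above:

```python
import json

# llama-server (llama.cpp) serves an OpenAI-compatible chat endpoint.
# Port 8080 is llama-server's default; adjust if you pass --port.
API_URL = "http://192.168.1.68:8080/v1/chat/completions"

payload = {
    # The model name is largely informational for a single-model server.
    "model": "Qwen3VL-8B-Instruct-Q8_0",
    "messages": [
        {"role": "user", "content": "Summarize what llama-server does."}
    ],
    "max_tokens": 128,
}

# Sending the request requires the server above to be running:
#   import urllib.request
#   req = urllib.request.Request(
#       API_URL,
#       data=json.dumps(payload).encode(),
#       headers={"Content-Type": "application/json"},
#   )
#   print(urllib.request.urlopen(req).read().decode())
print(json.dumps(payload, indent=2))
```

Any OpenAI-style client (curl, openai-python pointed at a custom base URL, etc.) works the same way against this endpoint.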
I’m running my own LLM because:
Privacy? 57% say it’s the biggest AI barrier…
But 48% still leak company data anyway.
CRAFT says privacy is architecture, not policy.
So I’m not waiting for “beta” — I’m beta-ing my data.
February 2026? Nah. I’m already typing on my own GPU.
Privacy’s not a feature — it’s a feature flag I turned on before the release.
And honestly? My model’s less “AI” and more “I’m not giving your data to strangers.”
Run your own. It’s fun. It’s free. It’s your data.
And it’s way more satisfying than waiting for “beta.”
(Also, no one’s gonna steal your jokes now. 😉)
Where is the Truth Commons License?
Great. Would it run on 24 GB VRAM?
PaddlePaddle/PaddleOCR-VL
✨ Ultra-efficient NaViT + ERNIE-4.5 architecture
✨ Supports 109 languages 🤯
✨ Accurately recognizes text, tables, formulas & charts
✨ Fast inference and lightweight for deployment
The xLLMs project is a growing suite of multilingual and multimodal dialogue datasets designed to train and evaluate advanced conversational LLMs. Each dataset focuses on a specific capability — from long-context reasoning and factual grounding to STEM explanations, math Q&A, and polite multilingual interaction.
🌍 Explore the full collection on Hugging Face:
👉 lamhieu/xllms-66cdfe34307bb2edc8c6df7d
💬 Highlight: xLLMs – Dialogue Pubs
A large-scale multilingual dataset built from document-guided synthetic dialogues (Wikipedia, WikiHow, and technical sources). It’s ideal for training models on long-context reasoning, multi-turn coherence, and tool-augmented dialogue across 9 languages.
👉 lamhieu/xllms_dialogue_pubs
🧠 Designed for:
- Long-context and reasoning models
- Multilingual assistants
- Tool-calling and structured response learning
All datasets are open for research and development use — free, transparent, and carefully curated to improve dialogue model quality.
Personally, it would be a waste of my time. When I need a more freely behaving version, I use Phi-4 abliterated.
Learn how to search a video dataset and generate with Tevatron/OmniEmbed-v0.1-multivent, an all-modality retriever, and Qwen/Qwen2.5-Omni-7B, an any-to-any model, in this notebook 🤝 merve/smol-vision
So… who are they, and why does it matter?
Had a lot of fun co-writing this blog post with @xianbao , with key insights translated from Chinese, to unpack how this startup built a model that outperforms GPT-4.1, Claude Opus, and DeepSeek V3 on several major benchmarks.
🧵 A few standout facts:
1. From zero to $3.3B in 18 months:
Founded in March 2023, Moonshot is now backed by Alibaba, Tencent, Meituan, and HongShan.
2. A CEO who thinks from the end:
Yang Zhilin (31) previously worked at Meta AI, Google Brain, and Carnegie Mellon. His vision? Nothing less than AGI — still a rare ambition among Chinese AI labs.
3. A trillion-parameter model that’s surprisingly efficient:
Kimi K2 uses a mixture-of-experts architecture (32B active params per inference) and dominates on coding/math benchmarks.
4. The secret weapon: Muon optimizer:
A new training method that doubles efficiency, cuts memory in half, and ran 15.5T tokens with zero failures. Big implications.
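For the curious, the core step of Muon (per its public reference implementation) is a Newton-Schulz iteration that approximately orthogonalizes each 2-D weight's momentum matrix before the update. A minimal NumPy sketch, with the quintic coefficients taken from that public implementation — treat this as an illustration, not Moonshot's actual training code:

```python
import numpy as np

def newton_schulz_orthogonalize(G: np.ndarray, steps: int = 5) -> np.ndarray:
    """Approximately orthogonalize G via a quintic Newton-Schulz iteration,
    the core step of the Muon optimizer (coefficients from the public
    reference implementation)."""
    a, b, c = 3.4445, -4.7750, 2.0315
    # Normalize so every singular value is <= 1 and the iteration converges.
    X = G / (np.linalg.norm(G) + 1e-7)
    transposed = X.shape[0] > X.shape[1]
    if transposed:
        X = X.T  # iterate on the wide orientation, as the reference does
    for _ in range(steps):
        A = X @ X.T
        # Polynomial update: pushes all singular values toward 1.
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X

# Muon applies this to the momentum buffer of each 2-D weight, then takes
# an SGD-like step in the orthogonalized direction.
rng = np.random.default_rng(0)
M = rng.standard_normal((8, 16))  # stand-in for a momentum matrix
O = newton_schulz_orthogonalize(M)
print(O.shape)  # same shape as the input
```

Because the iteration avoids an exact SVD, it is cheap on GPUs, which is part of why Muon can improve training efficiency at scale.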
Most importantly, their move from closed to open source signals a broader shift in China’s AI scene — following Baidu’s pivot. But as Yang puts it: “Users are the only real leaderboard.”
👇 Check out the full post to explore what Kimi K2 can do, how to try it, and why it matters for the future of open-source LLMs:
https://huggingface.co/blog/fdaudens/moonshot-ai-kimi-k2-explained
No, the Pangu Model License Agreement Version 1.0 is not a free software license. It imposes significant restrictions, such as prohibiting use within the European Union (Section 3) and requiring attribution (Section 4.2), which conflict with the principles of free software licenses like the GNU GPL or Open Source Definition. The non-transferable clause (Section 2) and indemnity requirement (Section 7) further deviate from standard free software terms.
🔥 "Open Model"? More Like "Openly Restrictive"! 🔥
Huawei calls Pangu Pro MoE an "open model"? That’s like calling a locked door an "open invitation." Let’s break down the brilliant "openness" here:
- "No EU Allowed!" (Section 3) – Because nothing says "open" like banning entire continents. GDPR too scary for you, Huawei?
- "Powered by Pangu" or GTFO (Section 4.2) – Mandatory branding? Real open-source models don’t force you to be a walking billboard.
- Non-transferable license (Section 2) – Can’t pass it on? So much for community sharing.
- Indemnify Huawei for your use (Section 7) – If anything goes wrong, you pay, not them. How generous!
This isn’t an "open model"—it’s a marketing stunt wrapped in proprietary chains. True open-source (Apache, MIT, GPL) doesn’t come with geographic bans, forced attribution, and legal traps.
Huawei, either commit to real openness or stop insulting the FOSS community with this pretend-free nonsense. 🚮
"not commercial" license isn't "Open Source", so please be accurate to users.
Reference:
The Open Source Definition – Open Source Initiative:
https://opensource.org/osd
Gemma License (danger) is not Free Software and is not Open Source:
https://gnu.support/gnu-emacs/emacs-lisp/Gemma-License-danger-is-not-Free-Software-and-is-not-Open-Source.html
So Google's goal is simply monopoly and user dependence. I suggest using fully free (free as in freedom) LLMs.