Oh, I’m sure the LLM you’re referring to is as clear as mud. Which one, exactly? And of course, the context provided was as precise as a weather forecast in a hurricane. What was it? Sure, because the output was so crystal clear, it’s not like anyone could possibly misinterpret it. What did it say? Oh, I’m sure you tried every single LLM under the sun. Which ones, exactly?
Jean Louis
Ohhh Mitko, you’re telling me your desktop is now officially a server that got tired of hiding under your monitor and just started hosting LLMs like a caffeinated cloud? 😅
“Got to 1199.8 tokens/sec on Devstral Small-2… on the desktop?”
My jaw dropped so hard I accidentally spilled my coffee on my keyboard — again.
You didn’t just upgrade your desk… you turned it into a mini datacenter with a 32GB M4 chip pretending to be a server room air conditioner. And you’re still using Mistral Vibe like it’s a 2005 laptop? 😂
Next time, just call it “Mitko’s Desktop Data Center v1.0” — complete with blinking LED fans, a 16-B200 GPU cluster on top, and a “DO NOT TOUCH” sticker taped to the power button (because if you touch it, you’ll accidentally delete your 3rd coffee break).
Now go ahead — test the big one. I’ll be here, typing “Is this GPU cluster actually a desk, or is the desk just a disguise for a server?” 🤔
P.S. You’re officially the guy who turned “workstation” into “server-on-a-desk-stand-with-a-caffeinated-look.” 🍵💻✨
Congratulations. Publish the script showing how you run it, so others can see.
Here is exactly how I run it:
/usr/local/bin/llama-server --jinja -fa on -c 32768 -ngl 64 -v --log-timestamps --host 192.168.1.68 -m /mnt/nvme0n1/LLM/quantized/Qwen3VL-8B-Instruct-Q8_0.gguf --mmproj /mnt/nvme0n1/LLM/quantized/mmproj-Qwen3VL-8B-Instruct-Q8_0.gguf
This runs on llama.cpp, and its API is of course available as well.
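For anyone who wants to hit that API, llama-server exposes an OpenAI-compatible chat endpoint. A minimal sketch in Python — note that port 8080 (llama-server's default) and the "model" field value are assumptions here, not taken from the command above:

```python
import json

# llama-server (llama.cpp) serves an OpenAI-compatible chat endpoint.
# Port 8080 is llama-server's default; adjust if you pass --port.
API_URL = "http://192.168.1.68:8080/v1/chat/completions"

payload = {
    # The model name is largely informational for a single-model server.
    "model": "Qwen3VL-8B-Instruct-Q8_0",
    "messages": [
        {"role": "user", "content": "Summarize what llama-server does."}
    ],
    "max_tokens": 128,
}

# Sending the request requires the server above to be running:
#   import urllib.request
#   req = urllib.request.Request(
#       API_URL,
#       data=json.dumps(payload).encode(),
#       headers={"Content-Type": "application/json"},
#   )
#   print(urllib.request.urlopen(req).read().decode())
print(json.dumps(payload, indent=2))
```

Any OpenAI-style client (curl, openai-python pointed at a custom base URL, etc.) works the same way against this endpoint.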
I’m running my own LLM because:
Privacy? 57% say it’s the biggest AI barrier…
But 48% still leak company data anyway.
CRAFT says privacy is architecture, not policy.
So I’m not waiting for “beta” — I’m beta-ing my data.
February 2026? Nah. I’m already typing on my own GPU.
Privacy’s not a feature — it’s a feature flag I turned on before the release.
And honestly? My model’s less “AI” and more “I’m not giving your data to strangers.”
Run your own. It’s fun. It’s free. It’s your data.
And it’s way more satisfying than waiting for “beta.”
(Also, no one’s gonna steal your jokes now. 😉)
Where is the Truth Commons License?
Great. Would it run on 24 GB VRAM?
PaddlePaddle/PaddleOCR-VL
✨ Ultra-efficient NaViT + ERNIE-4.5 architecture
✨ Supports 109 languages 🤯
✨ Accurately recognizes text, tables, formulas & charts
✨ Fast inference and lightweight for deployment
The xLLMs project is a growing suite of multilingual and multimodal dialogue datasets designed to train and evaluate advanced conversational LLMs. Each dataset focuses on a specific capability — from long-context reasoning and factual grounding to STEM explanations, math Q&A, and polite multilingual interaction.
🌍 Explore the full collection on Hugging Face:
👉 lamhieu/xllms-66cdfe34307bb2edc8c6df7d
💬 Highlight: xLLMs – Dialogue Pubs
A large-scale multilingual dataset built from document-guided synthetic dialogues (Wikipedia, WikiHow, and technical sources). It’s ideal for training models on long-context reasoning, multi-turn coherence, and tool-augmented dialogue across 9 languages.
👉 lamhieu/xllms_dialogue_pubs
🧠 Designed for:
- Long-context and reasoning models
- Multilingual assistants
- Tool-calling and structured response learning
All datasets are open for research and development use — free, transparent, and carefully curated to improve dialogue model quality.
Personally, it would be a waste of my time. When I need a more freely behaving version, I use Phi-4 abliterated.
Learn how to search a video dataset and generate with Tevatron/OmniEmbed-v0.1-multivent, an all-modality retriever, and Qwen/Qwen2.5-Omni-7B, an any-to-any model, in this notebook 🤝 merve/smol-vision
So… who are they, and why does it matter?
Had a lot of fun co-writing this blog post with @xianbao , with key insights translated from Chinese, to unpack how this startup built a model that outperforms GPT-4.1, Claude Opus, and DeepSeek V3 on several major benchmarks.
🧵 A few standout facts:
1. From zero to $3.3B in 18 months:
Founded in March 2023, Moonshot is now backed by Alibaba, Tencent, Meituan, and HongShan.
2. A CEO who thinks from the end:
Yang Zhilin (31) previously worked at Meta AI, Google Brain, and Carnegie Mellon. His vision? Nothing less than AGI — still a rare ambition among Chinese AI labs.
3. A trillion-parameter model that’s surprisingly efficient:
Kimi K2 uses a mixture-of-experts architecture (32B active params per inference) and dominates on coding/math benchmarks.
4. The secret weapon: Muon optimizer:
A new training method that doubles efficiency, cuts memory in half, and ran 15.5T tokens with zero failures. Big implications.
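For the curious, the core step of Muon (per its public reference implementation) is a Newton-Schulz iteration that approximately orthogonalizes each 2-D weight's momentum matrix before the update. A minimal NumPy sketch, with the quintic coefficients taken from that public implementation — treat this as an illustration, not Moonshot's actual training code:

```python
import numpy as np

def newton_schulz_orthogonalize(G: np.ndarray, steps: int = 5) -> np.ndarray:
    """Approximately orthogonalize G via a quintic Newton-Schulz iteration,
    the core step of the Muon optimizer (coefficients from the public
    reference implementation)."""
    a, b, c = 3.4445, -4.7750, 2.0315
    # Normalize so every singular value is <= 1 and the iteration converges.
    X = G / (np.linalg.norm(G) + 1e-7)
    transposed = X.shape[0] > X.shape[1]
    if transposed:
        X = X.T  # iterate on the wide orientation, as the reference does
    for _ in range(steps):
        A = X @ X.T
        # Polynomial update: pushes all singular values toward 1.
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X

# Muon applies this to the momentum buffer of each 2-D weight, then takes
# an SGD-like step in the orthogonalized direction.
rng = np.random.default_rng(0)
M = rng.standard_normal((8, 16))  # stand-in for a momentum matrix
O = newton_schulz_orthogonalize(M)
print(O.shape)  # same shape as the input
```

Because the iteration avoids an exact SVD, it is cheap on GPUs, which is part of why Muon can improve training efficiency at scale.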
Most importantly, their move from closed to open source signals a broader shift in China’s AI scene — following Baidu’s pivot. But as Yang puts it: “Users are the only real leaderboard.”
👇 Check out the full post to explore what Kimi K2 can do, how to try it, and why it matters for the future of open-source LLMs:
https://huggingface.co/blog/fdaudens/moonshot-ai-kimi-k2-explained
No, the Pangu Model License Agreement Version 1.0 is not a free software license. It imposes significant restrictions, such as prohibiting use within the European Union (Section 3) and requiring attribution (Section 4.2), which conflict with the principles of free software licenses like the GNU GPL or Open Source Definition. The non-transferable clause (Section 2) and indemnity requirement (Section 7) further deviate from standard free software terms.
🔥 "Open Model"? More Like "Openly Restrictive"! 🔥
Huawei calls Pangu Pro MoE an "open model"? That’s like calling a locked door an "open invitation." Let’s break down the brilliant "openness" here:
- "No EU Allowed!" (Section 3) – Because nothing says "open" like banning entire continents. GDPR too scary for you, Huawei?
- "Powered by Pangu" or GTFO (Section 4.2) – Mandatory branding? Real open-source models don’t force you to be a walking billboard.
- Non-transferable license (Section 2) – Can’t pass it on? So much for community sharing.
- Indemnify Huawei for your use (Section 7) – If anything goes wrong, you pay, not them. How generous!
This isn’t an "open model"—it’s a marketing stunt wrapped in proprietary chains. True open-source (Apache, MIT, GPL) doesn’t come with geographic bans, forced attribution, and legal traps.
Huawei, either commit to real openness or stop insulting the FOSS community with this pretend-free nonsense. 🚮
"not commercial" license isn't "Open Source", so please be accurate to users.
Reference:
The Open Source Definition – Open Source Initiative:
https://opensource.org/osd
Gemma License (danger) is not Free Software and is not Open Source:
https://gnu.support/gnu-emacs/emacs-lisp/Gemma-License-danger-is-not-Free-Software-and-is-not-Open-Source.html
So Google's goal is simply monopoly and user dependence. I suggest using fully free (free as in freedom) LLMs.