Junrulu

tencent

https://www.linkedin.com/in/junrulu/

LuJunru

Recent Activity

replied to their post about 6 hours ago

We are pleased to introduce Youtu-LLM, a new lightweight LLM. (1) Youtu-LLM has 2B parameters in total and uses a 32-layer dense MLA architecture with a new vocabulary oriented toward STEM and agentic tasks. (2) Pre-trained on approximately 11T tokens, with native 128k long-context extension and native agentic mid-training, Youtu-LLM-2B is comparable to Qwen3-4B in general and agent capabilities. (3) We have open-sourced the Base and Instruct versions, along with the evaluation code for reproducing the test metrics. The technical report describes our experience with native agentic pre-training in detail. Youtu-LLM-2B is well suited as a starting point for exploring on-device agents, and we are currently extending this paradigm to larger scales. We welcome further discussion and collaboration! 🔗 Check the project here: https://github.com/TencentCloudADP/youtu-tip/tree/master/youtu-llm 🤗 Check the models here: https://huggingface.co/collections/tencent/youtu

new activity about 10 hours ago

tencent/Youtu-LLM-2B-GGUF:Update README.md

new activity about 10 hours ago

AaryanK/Youtu-LLM-2B-GGUF:Thinking tags

published an article 11 months ago

Article

Exploring MLA's Improvements and Benefits Through the DeepSeek Code

Feb 20, 2025

published an article 11 months ago

Article

Preference Optimization Techniques for Large Models: DPO and Its Variants

Feb 20, 2025