Open to Collab
Junrulu
13 followers · 5 following
https://www.linkedin.com/in/junrulu/
LuJunru
AI & ML interests
None yet
Recent Activity
Replied to their post about 6 hours ago:
We are pleased to introduce Youtu-LLM, a new lightweight LLM:
(1) Youtu-LLM has 2B parameters in total, uses a 32-layer dense MLA architecture, and comes with an innovative STEM- and agent-oriented vocabulary.
(2) Pre-trained on approximately 11T tokens, with native 128k long-context extension and native agentic mid-training, Youtu-LLM-2B is comparable to Qwen3-4B in general and agent capabilities.
(3) We have open-sourced the Base and Instruct versions, along with the evaluation code for reproducing the test metrics.
In the technical report, we share our experience with native agentic pre-training in detail. Youtu-LLM-2B is well suited as a starting point for exploring on-device agents. We are also extending this paradigm to larger scales, and we welcome further discussion and collaboration!
🔗 Check the project here: https://github.com/TencentCloudADP/youtu-tip/tree/master/youtu-llm
🤗 Check the models here: https://huggingface.co/collections/tencent/youtu
New activity about 10 hours ago in tencent/Youtu-LLM-2B-GGUF: Update README.md
New activity about 10 hours ago in AaryanK/Youtu-LLM-2B-GGUF: Thinking tags
Junrulu's activity
Published an article 11 months ago
Article: Exploring the Improvements and Benefits of MLA Through DeepSeek's Code (结合Deepseek代码探讨MLA的改进及收益)
Feb 20, 2025 • 7
Published an article 11 months ago
Article: Preference Optimization Techniques for Large Models: DPO and Its Variants (大模型偏好优化技术:DPO及其变种)
Feb 20, 2025 • 19