Collections
Collections including paper arxiv:2508.06471
- Attention Is All You Need
  Paper • 1706.03762 • Published • 122
- Scaling Laws for Neural Language Models
  Paper • 2001.08361 • Published • 10
- Training Compute-Optimal Large Language Models
  Paper • 2203.15556 • Published • 11
- Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT
  Paper • 2210.04186 • Published

- Two Minds Better Than One: Collaborative Reward Modeling for LLM Alignment
  Paper • 2505.10597 • Published
- COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values
  Paper • 2504.05535 • Published • 44
- nvidia/HelpSteer3
  Viewer • Updated • 133k • 4.77k • 109
- nvidia/Nemotron-RL-instruction_following
  Preview • Updated • 379 • 15

- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 20
- Evaluating Large Language Models Trained on Code
  Paper • 2107.03374 • Published • 10
- Training language models to follow instructions with human feedback
  Paper • 2203.02155 • Published • 24
- GPT-4 Technical Report
  Paper • 2303.08774 • Published • 7

- Less is More: Recursive Reasoning with Tiny Networks
  Paper • 2510.04871 • Published • 514
- zai-org/GLM-4.6
  Text Generation • 357B • Updated • 33.2k • 1.21k
- deepseek-ai/DeepSeek-R1
  Text Generation • 685B • Updated • 3.87M • 13.3k
- deepseek-ai/DeepSeek-V3.2-Exp
  Text Generation • Updated • 189k • 992

- Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
  Paper • 2505.24726 • Published • 282
- Reinforcement Pre-Training
  Paper • 2506.08007 • Published • 265
- GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
  Paper • 2507.01006 • Published • 255
- A Survey of Context Engineering for Large Language Models
  Paper • 2507.13334 • Published • 263