Ai-models
Ultra-Long Sequence Distributed Transformer
Paper
• 2311.02382
• Published
• 6
Ziya2: Data-centric Learning is All LLMs Need
Paper
• 2311.03301
• Published
• 20
Relax: Composable Abstractions for End-to-End Dynamic Machine Learning
Paper
• 2311.02103
• Published
• 20
Extending Context Window of Large Language Models via Semantic Compression
Paper
• 2312.09571
• Published
• 16
Secrets of RLHF in Large Language Models Part II: Reward Modeling
Paper
• 2401.06080
• Published
• 28
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Paper
• 2401.06066
• Published
• 59
xDAN-AI/xDAN-L1-Chat-RL-v1
Text Generation
• 7B • Updated
• 619
• 63
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows
Paper
• 2402.10379
• Published
• 31
Ouroboros: Speculative Decoding with Large Model Enhanced Drafting
Paper
• 2402.13720
• Published
• 7
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
Paper
• 2403.10704
• Published
• 60
Larimar: Large Language Models with Episodic Memory Control
Paper
• 2403.11901
• Published
• 33
Evolutionary Optimization of Model Merging Recipes
Paper
• 2403.13187
• Published
• 58
ZigMa: Zigzag Mamba Diffusion Model
Paper
• 2403.13802
• Published
• 18
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Paper
• 2403.13372
• Published
• 179
Can large language models explore in-context?
Paper
• 2403.15371
• Published
• 33
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
Paper
• 2403.15042
• Published
• 27
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions
Paper
• 2403.15246
• Published
• 11
Rho-1: Not All Tokens Are What You Need
Paper
• 2404.07965
• Published
• 94
Multi-Head Mixture-of-Experts
Paper
• 2404.15045
• Published
• 60
Capabilities of Gemini Models in Medicine
Paper
• 2404.18416
• Published
• 25
Many-Shot In-Context Learning in Multimodal Foundation Models
Paper
• 2405.09798
• Published
• 32
Self-Improving Robust Preference Optimization
Paper
• 2406.01660
• Published
• 20
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning
Paper
• 2406.08973
• Published
• 89
Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models
Paper
• 2407.01906
• Published
• 46
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs
Paper
• 2407.04051
• Published
• 40
Finch: Prompt-guided Key-Value Cache Compression
Paper
• 2408.00167
• Published
• 17
RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation
Paper
• 2408.02545
• Published
• 40