interesting
updated
Paper
• 2309.03450
• Published
• 8
FLM-101B: An Open LLM and How to Train It with $100K Budget
Paper
• 2309.03852
• Published
• 45
Robotic Table Tennis: A Case Study into a High Speed Learning System
Paper
• 2309.03315
• Published
• 7
Improving Text Embeddings with Large Language Models
Paper
• 2401.00368
• Published
• 82
Understanding LLMs: A Comprehensive Overview from Training to Inference
Paper
• 2401.02038
• Published
• 65
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
Paper
• 2401.16380
• Published
• 51
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Paper
• 2401.15024
• Published
• 73
Can Large Language Models Understand Context?
Paper
• 2402.00858
• Published
• 24
Specialized Language Models with Cheap Inference from Limited Domain Data
Paper
• 2402.01093
• Published
• 47
SONAR-LLM: Autoregressive Transformer that Thinks in Sentence Embeddings and Speaks in Tokens
Paper
• 2508.05305
• Published
• 47
Who invented deep residual learning?
Paper
• 2509.24732
• Published
• 5
Rethinking the shape convention of an MLP
Paper
• 2510.01796
• Published
• 5
End-to-End Test-Time Training for Long Context
Paper
• 2512.23675
• Published
• 24