Collections including paper arxiv:2309.05463

- Attention Is All You Need
  Paper • 1706.03762 • Published • 108
- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 18
- LLaMA: Open and Efficient Foundation Language Models
  Paper • 2302.13971 • Published • 20
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 248

- Textbooks Are All You Need
  Paper • 2306.11644 • Published • 151
- Textbooks Are All You Need II: phi-1.5 technical report
  Paper • 2309.05463 • Published • 88
- TinyStories: How Small Can Language Models Be and Still Speak Coherent English?
  Paper • 2305.07759 • Published • 38
- Scaling Synthetic Data Creation with 1,000,000,000 Personas
  Paper • 2406.20094 • Published • 104

- Attention Is All You Need
  Paper • 1706.03762 • Published • 108
- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 18
- GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints
  Paper • 2305.13245 • Published • 6
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 248

- Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
  Paper • 2402.13064 • Published • 50
- Textbooks Are All You Need II: phi-1.5 technical report
  Paper • 2309.05463 • Published • 88
- DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows
  Paper • 2402.10379 • Published • 31
- Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
  Paper • 2312.06585 • Published • 29