
Axolotl

Axolotl is a fine-tuning and post-training framework for large language models. It supports adapter-based tuning, ND-parallel distributed training, GRPO (Group Relative Policy Optimization), and quantization-aware training (QAT). Through TRL, Axolotl also handles preference learning, reinforcement learning, and reward modeling workflows.

Define your training run in a YAML config file.

base_model: NousResearch/Nous-Hermes-llama-1b-v1
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

datasets:
  - path: tatsu-lab/alpaca
    type: alpaca

output_dir: ./outputs
sequence_len: 512
micro_batch_size: 1
gradient_accumulation_steps: 1
num_epochs: 1
learning_rate: 2.0e-5
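
The adapter-based tuning mentioned above is enabled by adding a few more keys to the same YAML file. This is a minimal sketch using Axolotl's LoRA options; the values shown are illustrative, not recommendations.

adapter: lora
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true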

Launch training with the train command.

axolotl train my_config.yml
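
For larger datasets, you can optionally tokenize and cache the data ahead of time with the preprocess command before training; a sketch assuming the same config file.

axolotl preprocess my_config.yml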

Transformers integration

Axolotl's ModelLoader wraps the Transformers model-loading flow: the model_type and tokenizer_type fields in the config map to Transformers auto classes (AutoModelForCausalLM and AutoTokenizer in the example above), which load the base_model checkpoint before training starts.
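
Because loading goes through Transformers, the checkpoint Axolotl writes to output_dir can be loaded back with the same auto classes. A minimal sketch, assuming the full fine-tune from the example config above (no adapter) completed and wrote a Transformers-format checkpoint to ./outputs:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Path assumed from the output_dir value in the example config above.
checkpoint = "./outputs"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Quick smoke test of the fine-tuned model.
inputs = tokenizer("Give three tips for staying healthy.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))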

Resources
