Built a stability adapter on top of TRL's SFTTrainer (CRMA + ZClip) — sharing ablation results

by Fourwheels2512 - opened

Hi TRL community,

I've been using TRL's SFTTrainer as the backbone for a fine-tuning SaaS and wanted to share something I built on top of it: CRMA (Constrained Residual Mixing Adapter), a stability adapter that runs alongside LoRA/QLoRA.

What CRMA adds to SFTTrainer

CRMA hooks into the training loop via a custom optimizer group and per-step callback. It adds:

  • A Sinkhorn-constrained doubly stochastic mixing matrix at each transformer block
  • ZClip adaptive gradient clipping (replaces the fixed max_grad_norm=1.0 with a z-score-based threshold, arXiv:2504.02507)
  • PiSSA initialization for low-rank projections (NeurIPS 2024, arXiv:2404.02948)
  • Per-step logging of baseline vs CRMA gradient norms and spectral norm
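
To make the first bullet concrete, here is a minimal NumPy sketch of Sinkhorn normalization. It is not CRMA's actual code: the function name `sinkhorn`, the 4x4 size, and the iteration count are assumptions chosen for illustration. It only shows why a doubly stochastic matrix ends up with a spectral norm of 1.

```python
import numpy as np

def sinkhorn(logits, n_iters=200):
    """Push a square matrix of logits toward a doubly stochastic matrix.

    Hypothetical helper for illustration; how CRMA parameterizes its
    mixing matrix is not described in this post.
    """
    # Exponentiate (shifted by the max for numerical stability) so every entry is positive.
    m = np.exp(logits - logits.max())
    for _ in range(n_iters):
        m = m / m.sum(axis=1, keepdims=True)  # normalize rows to sum to 1
        m = m / m.sum(axis=0, keepdims=True)  # normalize columns to sum to 1
    return m

rng = np.random.default_rng(42)
mix = sinkhorn(rng.normal(size=(4, 4)))

row_sums = mix.sum(axis=1)
col_sums = mix.sum(axis=0)
# Largest singular value; it equals 1 for an exactly doubly stochastic matrix.
spectral_norm = np.linalg.norm(mix, ord=2)
print(row_sums, col_sums, spectral_norm)
```

Because rows and columns both sum to 1, the all-ones vector is mapped to itself and no other direction is amplified, which is the property the spectral-norm log tracks.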

All of this runs cleanly inside SFTTrainer with a custom optimizers= tuple.
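
For readers unfamiliar with z-score clipping, the idea behind the ZClip bullet can be sketched in plain Python as follows. This is a simplified illustration, not the reference ZClip implementation: the class name `ZScoreClipper`, its default values, and the exact update rule are assumptions, and the sample norms are made up apart from the ~263 spike reported below.

```python
import math

class ZScoreClipper:
    """Simplified z-score clipper for gradient norms (illustrative only)."""

    def __init__(self, z_thresh=2.5, alpha=0.97, warmup_steps=5):
        self.z_thresh = z_thresh          # how many standard deviations count as a spike
        self.alpha = alpha                # EMA decay for the running statistics
        self.warmup_steps = warmup_steps  # steps used to initialize mean and variance
        self.warmup = []
        self.mean = None
        self.var = None

    def step(self, grad_norm):
        """Return the norm to clip to at this step and update the running stats."""
        if self.mean is None:
            # Warmup: collect raw norms, then initialize mean and variance from them.
            self.warmup.append(grad_norm)
            if len(self.warmup) == self.warmup_steps:
                n = len(self.warmup)
                self.mean = sum(self.warmup) / n
                self.var = sum((g - self.mean) ** 2 for g in self.warmup) / n
            return grad_norm
        limit = self.mean + self.z_thresh * math.sqrt(self.var)
        clipped = min(grad_norm, limit)
        # Update with the clipped value so one spike does not inflate the statistics.
        self.mean = self.alpha * self.mean + (1 - self.alpha) * clipped
        self.var = self.alpha * self.var + (1 - self.alpha) * (clipped - self.mean) ** 2
        return clipped

clipper = ZScoreClipper()
norms = [2.1, 2.4, 2.2, 2.5, 2.3, 2.4, 263.0, 2.2]  # illustrative values with one spike
clipped = [clipper.step(g) for g in norms]
print(clipped)
```

In a real training loop the returned value would typically be passed as `max_norm` to `torch.nn.utils.clip_grad_norm_` in place of a fixed 1.0.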

Ablation results (TinyLlama 1.1B, 200-row Alpaca, seed=42)

| Metric | LoRA only | LoRA + CRMA | Delta |
|---|---|---|---|
| Final loss | 0.1658 | 0.1651 | -0.4% |
| Peak grad norm | 12.15 | 5.75 | -52.7% |
| Mean grad norm | 2.34 | 2.07 | -11.5% |
| Spectral norm | - | 1.000000 | guaranteed <= 1 |

Mistral-7B: plain LoRA hit a catastrophic gradient spike at step 43 (grad norm ~263). CRMA held it at ~3.0, a 98.9% reduction.
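
For anyone checking the numbers, the percentages quoted above are relative changes against the LoRA-only baseline. A short sketch of that arithmetic follows (the helper name `pct_change` is mine, and the Mistral-7B values are the approximate ones quoted above):

```python
def pct_change(baseline, treated):
    """Relative change of `treated` versus `baseline`, in percent."""
    return (treated - baseline) / baseline * 100.0

# (LoRA only, LoRA + CRMA) pairs taken from the figures reported in this post.
rows = {
    "Final loss": (0.1658, 0.1651),
    "Peak grad norm": (12.15, 5.75),
    "Mean grad norm": (2.34, 2.07),
    "Mistral-7B spike": (263.0, 3.0),
}

deltas = {name: round(pct_change(b, t), 1) for name, (b, t) in rows.items()}
for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f}%")
```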

HF Space

https://huggingface.co/spaces/Fourwheels2512/crma-fine-tuner

Would love any feedback from TRL users, especially on cleaner ways to hook into the trainer for per-step stability metrics.
