Juvoly/J1-Llama-8B-exp
Thanks! I'm trying to draw attention to it, as I think the leap from pretraining diffusion models (10,000+ hours) to mere finetuning (10 hours) for adaptation is a big one, and could really help this method gain some traction.