exceptions_exp2_last_to_push_frequency_1032

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.5581
  • Accuracy: 0.3699
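
Since the card reports token-level accuracy alongside the loss, the loss plausibly corresponds to a language-modeling cross-entropy objective; under that assumption, the evaluation loss translates to a perplexity of exp(3.5581) ≈ 35.1. A quick sketch of the conversion (the cross-entropy interpretation is an assumption, not stated in the card):

```python
import math

eval_loss = 3.5581  # reported evaluation loss, assumed to be mean cross-entropy in nats
print(f"perplexity: {math.exp(eval_loss):.2f}")  # ≈ 35.10
```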

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0006
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 1032
  • gradient_accumulation_steps: 5
  • total_train_batch_size: 80
  • optimizer: AdamW (adamw_torch_fused) with betas=(0.9, 0.98) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
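
The total train batch size of 80 follows from 16 samples per device times 5 gradient-accumulation steps. For reference, here is a minimal sketch of how these settings map onto the Hugging Face transformers TrainingArguments API; the output_dir is a placeholder, and reading "Native AMP" as fp16=True is an assumption:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="exceptions_exp2_last_to_push_frequency_1032",  # placeholder path
    learning_rate=6e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=1032,
    gradient_accumulation_steps=5,   # effective train batch: 16 * 5 = 80
    optim="adamw_torch_fused",       # OptimizerNames.ADAMW_TORCH_FUSED
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=50.0,
    fp16=True,                       # assumed reading of "Native AMP"
)
```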

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Accuracy |
|:-------------:|:-------:|:-----:|:---------------:|:--------:|
| 4.8313        | 0.2912  | 1000  | 4.7483          | 0.2562   |
| 4.3401        | 0.5824  | 2000  | 4.2830          | 0.2995   |
| 4.1591        | 0.8737  | 3000  | 4.0990          | 0.3153   |
| 4.001         | 1.1648  | 4000  | 3.9891          | 0.3252   |
| 3.9223        | 1.4561  | 5000  | 3.9149          | 0.3321   |
| 3.8782        | 1.7473  | 6000  | 3.8584          | 0.3371   |
| 3.7327        | 2.0384  | 7000  | 3.8165          | 0.3413   |
| 3.7531        | 2.3297  | 8000  | 3.7841          | 0.3442   |
| 3.7559        | 2.6209  | 9000  | 3.7527          | 0.3471   |
| 3.7252        | 2.9121  | 10000 | 3.7280          | 0.3495   |
| 3.6288        | 3.2033  | 11000 | 3.7151          | 0.3513   |
| 3.6325        | 3.4945  | 12000 | 3.7007          | 0.3531   |
| 3.6478        | 3.7857  | 13000 | 3.6797          | 0.3551   |
| 3.5321        | 4.0769  | 14000 | 3.6732          | 0.3562   |
| 3.5763        | 4.3681  | 15000 | 3.6602          | 0.3572   |
| 3.5748        | 4.6593  | 16000 | 3.6472          | 0.3586   |
| 3.5768        | 4.9506  | 17000 | 3.6360          | 0.3594   |
| 3.5           | 5.2417  | 18000 | 3.6378          | 0.3605   |
| 3.5303        | 5.5329  | 19000 | 3.6282          | 0.3607   |
| 3.5332        | 5.8242  | 20000 | 3.6155          | 0.3617   |
| 3.4541        | 6.1153  | 21000 | 3.6171          | 0.3624   |
| 3.4769        | 6.4065  | 22000 | 3.6116          | 0.3630   |
| 3.493         | 6.6978  | 23000 | 3.6006          | 0.3637   |
| 3.503         | 6.9890  | 24000 | 3.5904          | 0.3646   |
| 3.4301        | 7.2802  | 25000 | 3.6006          | 0.3649   |
| 3.4586        | 7.5714  | 26000 | 3.5923          | 0.3653   |
| 3.463         | 7.8626  | 27000 | 3.5825          | 0.3659   |
| 3.3797        | 8.1538  | 28000 | 3.5888          | 0.3661   |
| 3.41          | 8.4450  | 29000 | 3.5835          | 0.3666   |
| 3.4295        | 8.7362  | 30000 | 3.5747          | 0.3671   |
| 3.3345        | 9.0274  | 31000 | 3.5795          | 0.3672   |
| 3.3747        | 9.3186  | 32000 | 3.5765          | 0.3675   |
| 3.4052        | 9.6098  | 33000 | 3.5720          | 0.3679   |
| 3.4182        | 9.9010  | 34000 | 3.5621          | 0.3688   |
| 3.327         | 10.1922 | 35000 | 3.5736          | 0.3682   |
| 3.3748        | 10.4834 | 36000 | 3.5681          | 0.3689   |
| 3.3862        | 10.7747 | 37000 | 3.5599          | 0.3693   |
| 3.2947        | 11.0658 | 38000 | 3.5684          | 0.3691   |
| 3.3435        | 11.3570 | 39000 | 3.5660          | 0.3692   |
| 3.3568        | 11.6483 | 40000 | 3.5581          | 0.3699   |
| 3.3558        | 11.9395 | 41000 | 3.5499          | 0.3704   |
| 3.3047        | 12.2306 | 42000 | 3.5656          | 0.3696   |
| 3.3348        | 12.5219 | 43000 | 3.5571          | 0.3703   |
| 3.3468        | 12.8131 | 44000 | 3.5468          | 0.3709   |
| 3.263         | 13.1043 | 45000 | 3.5615          | 0.3704   |
| 3.304         | 13.3955 | 46000 | 3.5539          | 0.3707   |
| 3.3326        | 13.6867 | 47000 | 3.5488          | 0.3713   |
| 3.3425        | 13.9779 | 48000 | 3.5420          | 0.3716   |
| 3.2725        | 14.2691 | 49000 | 3.5558          | 0.3710   |
| 3.2972        | 14.5603 | 50000 | 3.5509          | 0.3714   |
| 3.3232        | 14.8515 | 51000 | 3.5414          | 0.3720   |
| 3.2364        | 15.1427 | 52000 | 3.5593          | 0.3713   |
| 3.2811        | 15.4339 | 53000 | 3.5507          | 0.3716   |
| 3.3125        | 15.7251 | 54000 | 3.5414          | 0.3722   |
| 3.203         | 16.0163 | 55000 | 3.5496          | 0.3722   |
| 3.2484        | 16.3075 | 56000 | 3.5501          | 0.3721   |
| 3.2875        | 16.5988 | 57000 | 3.5429          | 0.3725   |
| 3.298         | 16.8900 | 58000 | 3.5375          | 0.3728   |
| 3.2299        | 17.1811 | 59000 | 3.5518          | 0.3722   |
| 3.2673        | 17.4724 | 60000 | 3.5452          | 0.3726   |
| 3.2797        | 17.7636 | 61000 | 3.5380          | 0.3729   |
| 3.1771        | 18.0547 | 62000 | 3.5497          | 0.3726   |
| 3.242         | 18.3460 | 63000 | 3.5485          | 0.3727   |
| 3.2572        | 18.6372 | 64000 | 3.5411          | 0.3730   |
| 3.2889        | 18.9284 | 65000 | 3.5364          | 0.3733   |
| 3.2037        | 19.2196 | 66000 | 3.5487          | 0.3727   |
| 3.2257        | 19.5108 | 67000 | 3.5439          | 0.3734   |
| 3.2564        | 19.8020 | 68000 | 3.5343          | 0.3738   |
| 3.1653        | 20.0932 | 69000 | 3.5531          | 0.3732   |
| 3.2087        | 20.3844 | 70000 | 3.5481          | 0.3732   |
| 3.2412        | 20.6756 | 71000 | 3.5377          | 0.3737   |
| 3.254         | 20.9669 | 72000 | 3.5304          | 0.3743   |
| 3.1871        | 21.2580 | 73000 | 3.5523          | 0.3732   |
| 3.2088        | 21.5492 | 74000 | 3.5418          | 0.3735   |
| 3.2292        | 21.8405 | 75000 | 3.5338          | 0.3741   |
| 3.1721        | 22.1316 | 76000 | 3.5507          | 0.3734   |
| 3.2015        | 22.4229 | 77000 | 3.5428          | 0.3739   |
| 3.2267        | 22.7141 | 78000 | 3.5357          | 0.3741   |
| 3.203         | 23.0052 | 79000 | 3.5419          | 0.3741   |
| 3.1781        | 23.2965 | 80000 | 3.5480          | 0.3736   |
| 3.2052        | 23.5877 | 81000 | 3.5400          | 0.3743   |
| 3.2213        | 23.8789 | 82000 | 3.5358          | 0.3746   |
| 3.1679        | 24.1701 | 83000 | 3.5492          | 0.3738   |
| 3.1752        | 24.4613 | 84000 | 3.5445          | 0.3738   |
| 3.2067        | 24.7525 | 85000 | 3.5353          | 0.3747   |
| 3.1235        | 25.0437 | 86000 | 3.5494          | 0.3739   |
| 3.1553        | 25.3349 | 87000 | 3.5456          | 0.3743   |
| 3.1768        | 25.6261 | 88000 | 3.5417          | 0.3745   |
| 3.2101        | 25.9174 | 89000 | 3.5307          | 0.3750   |
| 3.135         | 26.2085 | 90000 | 3.5507          | 0.3740   |
| 3.1773        | 26.4997 | 91000 | 3.5428          | 0.3746   |
| 3.1765        | 26.7910 | 92000 | 3.5363          | 0.3749   |
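
Note that the headline metrics at the top of the card (loss 3.5581, accuracy 0.3699) match the step-40000 row rather than the final one, which suggests the pushed checkpoint predates the end of the logged run. The card does not name the base architecture, but the paired loss/accuracy metrics are consistent with next-token prediction. Below is a hedged sketch of reproducing the token-level accuracy metric for a causal-LM checkpoint; the repo id and the AutoModelForCausalLM head are assumptions, not facts from this card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "exceptions_exp2_last_to_push_frequency_1032"  # hypothetical local/Hub path
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
model.eval()

# Token-level accuracy: the fraction of next-token predictions that match
# the reference token, evaluated greedily from the logits.
text = "Example input for a quick sanity check."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits          # (batch, seq_len, vocab)
preds = logits[:, :-1].argmax(dim=-1)        # prediction for each next token
labels = inputs["input_ids"][:, 1:]          # shifted targets
accuracy = (preds == labels).float().mean().item()
print(f"next-token accuracy: {accuracy:.4f}")
```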

Framework versions

  • Transformers 4.55.2
  • PyTorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4

Model size: 0.1B params (tensor type F32, Safetensors format)