exceptions_exp2_last_to_push_frequency_1032

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.5581
  • Accuracy: 0.3699
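
Since the card reports token-level accuracy alongside the loss, the loss plausibly corresponds to a language-modeling cross-entropy objective; under that assumption, the evaluation loss translates to a perplexity of exp(3.5581) ≈ 35.1. A quick sketch of the conversion (the cross-entropy interpretation is an assumption, not stated in the card):

```python
import math

eval_loss = 3.5581  # reported evaluation loss, assumed to be mean cross-entropy in nats
print(f"perplexity: {math.exp(eval_loss):.2f}")  # ≈ 35.10
```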

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0006
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 1032
  • gradient_accumulation_steps: 5
  • total_train_batch_size: 80
  • optimizer: AdamW (adamw_torch_fused) with betas=(0.9, 0.98) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
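
The total train batch size of 80 follows from 16 samples per device times 5 gradient-accumulation steps. For reference, here is a minimal sketch of how these settings map onto the Hugging Face transformers TrainingArguments API; the output_dir is a placeholder, and reading "Native AMP" as fp16=True is an assumption:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="exceptions_exp2_last_to_push_frequency_1032",  # placeholder path
    learning_rate=6e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=1032,
    gradient_accumulation_steps=5,   # effective train batch: 16 * 5 = 80
    optim="adamw_torch_fused",       # OptimizerNames.ADAMW_TORCH_FUSED
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=50.0,
    fp16=True,                       # assumed reading of "Native AMP"
)
```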

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Accuracy |
|:-------------:|:-------:|:-----:|:---------------:|:--------:|
| 4.8313        | 0.2912  | 1000  | 4.7483          | 0.2562   |
| 4.3401        | 0.5824  | 2000  | 4.2830          | 0.2995   |
| 4.1591        | 0.8737  | 3000  | 4.0990          | 0.3153   |
| 4.001         | 1.1648  | 4000  | 3.9891          | 0.3252   |
| 3.9223        | 1.4561  | 5000  | 3.9149          | 0.3321   |
| 3.8782        | 1.7473  | 6000  | 3.8584          | 0.3371   |
| 3.7327        | 2.0384  | 7000  | 3.8165          | 0.3413   |
| 3.7531        | 2.3297  | 8000  | 3.7841          | 0.3442   |
| 3.7559        | 2.6209  | 9000  | 3.7527          | 0.3471   |
| 3.7252        | 2.9121  | 10000 | 3.7280          | 0.3495   |
| 3.6288        | 3.2033  | 11000 | 3.7151          | 0.3513   |
| 3.6325        | 3.4945  | 12000 | 3.7007          | 0.3531   |
| 3.6478        | 3.7857  | 13000 | 3.6797          | 0.3551   |
| 3.5321        | 4.0769  | 14000 | 3.6732          | 0.3562   |
| 3.5763        | 4.3681  | 15000 | 3.6602          | 0.3572   |
| 3.5748        | 4.6593  | 16000 | 3.6472          | 0.3586   |
| 3.5768        | 4.9506  | 17000 | 3.6360          | 0.3594   |
| 3.5           | 5.2417  | 18000 | 3.6378          | 0.3605   |
| 3.5303        | 5.5329  | 19000 | 3.6282          | 0.3607   |
| 3.5332        | 5.8242  | 20000 | 3.6155          | 0.3617   |
| 3.4541        | 6.1153  | 21000 | 3.6171          | 0.3624   |
| 3.4769        | 6.4065  | 22000 | 3.6116          | 0.3630   |
| 3.493         | 6.6978  | 23000 | 3.6006          | 0.3637   |
| 3.503         | 6.9890  | 24000 | 3.5904          | 0.3646   |
| 3.4301        | 7.2802  | 25000 | 3.6006          | 0.3649   |
| 3.4586        | 7.5714  | 26000 | 3.5923          | 0.3653   |
| 3.463         | 7.8626  | 27000 | 3.5825          | 0.3659   |
| 3.3797        | 8.1538  | 28000 | 3.5888          | 0.3661   |
| 3.41          | 8.4450  | 29000 | 3.5835          | 0.3666   |
| 3.4295        | 8.7362  | 30000 | 3.5747          | 0.3671   |
| 3.3345        | 9.0274  | 31000 | 3.5795          | 0.3672   |
| 3.3747        | 9.3186  | 32000 | 3.5765          | 0.3675   |
| 3.4052        | 9.6098  | 33000 | 3.5720          | 0.3679   |
| 3.4182        | 9.9010  | 34000 | 3.5621          | 0.3688   |
| 3.327         | 10.1922 | 35000 | 3.5736          | 0.3682   |
| 3.3748        | 10.4834 | 36000 | 3.5681          | 0.3689   |
| 3.3862        | 10.7747 | 37000 | 3.5599          | 0.3693   |
| 3.2947        | 11.0658 | 38000 | 3.5684          | 0.3691   |
| 3.3435        | 11.3570 | 39000 | 3.5660          | 0.3692   |
| 3.3568        | 11.6483 | 40000 | 3.5581          | 0.3699   |
| 3.3558        | 11.9395 | 41000 | 3.5499          | 0.3704   |
| 3.3047        | 12.2306 | 42000 | 3.5656          | 0.3696   |
| 3.3348        | 12.5219 | 43000 | 3.5571          | 0.3703   |
| 3.3468        | 12.8131 | 44000 | 3.5468          | 0.3709   |
| 3.263         | 13.1043 | 45000 | 3.5615          | 0.3704   |
| 3.304         | 13.3955 | 46000 | 3.5539          | 0.3707   |
| 3.3326        | 13.6867 | 47000 | 3.5488          | 0.3713   |
| 3.3425        | 13.9779 | 48000 | 3.5420          | 0.3716   |
| 3.2725        | 14.2691 | 49000 | 3.5558          | 0.3710   |
| 3.2972        | 14.5603 | 50000 | 3.5509          | 0.3714   |
| 3.3232        | 14.8515 | 51000 | 3.5414          | 0.3720   |
| 3.2364        | 15.1427 | 52000 | 3.5593          | 0.3713   |
| 3.2811        | 15.4339 | 53000 | 3.5507          | 0.3716   |
| 3.3125        | 15.7251 | 54000 | 3.5414          | 0.3722   |
| 3.203         | 16.0163 | 55000 | 3.5496          | 0.3722   |
| 3.2484        | 16.3075 | 56000 | 3.5501          | 0.3721   |
| 3.2875        | 16.5988 | 57000 | 3.5429          | 0.3725   |
| 3.298         | 16.8900 | 58000 | 3.5375          | 0.3728   |
| 3.2299        | 17.1811 | 59000 | 3.5518          | 0.3722   |
| 3.2673        | 17.4724 | 60000 | 3.5452          | 0.3726   |
| 3.2797        | 17.7636 | 61000 | 3.5380          | 0.3729   |
| 3.1771        | 18.0547 | 62000 | 3.5497          | 0.3726   |
| 3.242         | 18.3460 | 63000 | 3.5485          | 0.3727   |
| 3.2572        | 18.6372 | 64000 | 3.5411          | 0.3730   |
| 3.2889        | 18.9284 | 65000 | 3.5364          | 0.3733   |
| 3.2037        | 19.2196 | 66000 | 3.5487          | 0.3727   |
| 3.2257        | 19.5108 | 67000 | 3.5439          | 0.3734   |
| 3.2564        | 19.8020 | 68000 | 3.5343          | 0.3738   |
| 3.1653        | 20.0932 | 69000 | 3.5531          | 0.3732   |
| 3.2087        | 20.3844 | 70000 | 3.5481          | 0.3732   |
| 3.2412        | 20.6756 | 71000 | 3.5377          | 0.3737   |
| 3.254         | 20.9669 | 72000 | 3.5304          | 0.3743   |
| 3.1871        | 21.2580 | 73000 | 3.5523          | 0.3732   |
| 3.2088        | 21.5492 | 74000 | 3.5418          | 0.3735   |
| 3.2292        | 21.8405 | 75000 | 3.5338          | 0.3741   |
| 3.1721        | 22.1316 | 76000 | 3.5507          | 0.3734   |
| 3.2015        | 22.4229 | 77000 | 3.5428          | 0.3739   |
| 3.2267        | 22.7141 | 78000 | 3.5357          | 0.3741   |
| 3.203         | 23.0052 | 79000 | 3.5419          | 0.3741   |
| 3.1781        | 23.2965 | 80000 | 3.5480          | 0.3736   |
| 3.2052        | 23.5877 | 81000 | 3.5400          | 0.3743   |
| 3.2213        | 23.8789 | 82000 | 3.5358          | 0.3746   |
| 3.1679        | 24.1701 | 83000 | 3.5492          | 0.3738   |
| 3.1752        | 24.4613 | 84000 | 3.5445          | 0.3738   |
| 3.2067        | 24.7525 | 85000 | 3.5353          | 0.3747   |
| 3.1235        | 25.0437 | 86000 | 3.5494          | 0.3739   |
| 3.1553        | 25.3349 | 87000 | 3.5456          | 0.3743   |
| 3.1768        | 25.6261 | 88000 | 3.5417          | 0.3745   |
| 3.2101        | 25.9174 | 89000 | 3.5307          | 0.3750   |
| 3.135         | 26.2085 | 90000 | 3.5507          | 0.3740   |
| 3.1773        | 26.4997 | 91000 | 3.5428          | 0.3746   |
| 3.1765        | 26.7910 | 92000 | 3.5363          | 0.3749   |
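
Note that the headline metrics at the top of the card (loss 3.5581, accuracy 0.3699) match the step-40000 row rather than the final one, which suggests the pushed checkpoint predates the end of the logged run. The card does not name the base architecture, but the paired loss/accuracy metrics are consistent with next-token prediction. Below is a hedged sketch of reproducing the token-level accuracy metric for a causal-LM checkpoint; the repo id and the AutoModelForCausalLM head are assumptions, not facts from this card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "exceptions_exp2_last_to_push_frequency_1032"  # hypothetical local/Hub path
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
model.eval()

# Token-level accuracy: the fraction of next-token predictions that match
# the reference token, evaluated greedily from the logits.
text = "Example input for a quick sanity check."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits          # (batch, seq_len, vocab)
preds = logits[:, :-1].argmax(dim=-1)        # prediction for each next token
labels = inputs["input_ids"][:, 1:]          # shifted targets
accuracy = (preds == labels).float().mean().item()
print(f"next-token accuracy: {accuracy:.4f}")
```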

Framework versions

  • Transformers 4.55.2
  • PyTorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4

Model size: 0.1B params (tensor type F32, Safetensors format)