ssc-kcn-mms-model-mix-adapt-max3-devtrain
This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.7720
- Cer (character error rate): 0.1878
- Wer (word error rate): 0.4832
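For readers unfamiliar with the two error rates above, the following is a minimal sketch of how CER and WER are commonly defined: the edit distance between reference and hypothesis, divided by the reference length, counted over characters or words respectively. The function names and the example strings are illustrative assumptions; the evaluation code actually used for this model is not stated in this card and may normalize text differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (works on strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical transcripts, not taken from the evaluation set.
print(wer("the cat sat", "the cat sit"))
print(cer("the cat sat", "the cat sit"))
```

Under these definitions a single wrong character costs a whole word in WER but only one character in CER, which is consistent with the WER above being much higher than the CER.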
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 1
- eval_batch_size: 6
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 2
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 5
- mixed_precision_training: Native AMP
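The relationship between some of the values listed above can be sketched in plain Python. This is an illustrative reconstruction, not the training script: it assumes a single device (`NUM_DEVICES = 1`) and a total of roughly 3488 optimizer steps, an estimate derived from the results table (200 steps correspond to about 0.2867 epochs, over 5 epochs). The schedule function follows the usual definition of linear warmup followed by linear decay to zero.

```python
LEARNING_RATE = 0.0005
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 100
NUM_DEVICES = 1               # assumption: not stated in this card
ESTIMATED_TOTAL_STEPS = 3488  # assumption: estimated from the results table


def effective_batch_size(per_device=TRAIN_BATCH_SIZE,
                         accum=GRAD_ACCUM_STEPS,
                         devices=NUM_DEVICES):
    """Number of samples contributing to each optimizer update."""
    return per_device * accum * devices


def linear_lr(step, total_steps, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup:
        # Ramp linearly from 0 up to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup))
    return base_lr * remaining


print(effective_batch_size())  # matches total_train_batch_size above
for step in (0, 50, 100, 1000, ESTIMATED_TOTAL_STEPS):
    print(step, linear_lr(step, ESTIMATED_TOTAL_STEPS))
```

With a per-device batch of 1 and 2 accumulation steps, each update sees 2 samples, which agrees with the listed `total_train_batch_size`.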
Training results
| Training Loss | Epoch | Step | Validation Loss | Cer | Wer |
|---|---|---|---|---|---|
| 0.6953 | 0.2867 | 200 | 0.9526 | 0.2109 | 0.5338 |
| 0.7399 | 0.5735 | 400 | 0.9448 | 0.2104 | 0.5283 |
| 0.7360 | 0.8602 | 600 | 0.8170 | 0.1996 | 0.5173 |
| 0.6204 | 1.1462 | 800 | 0.8343 | 0.2048 | 0.5133 |
| 0.7700 | 1.4330 | 1000 | 0.9082 | 0.2060 | 0.5184 |
| 0.6475 | 1.7197 | 1200 | 0.7871 | 0.2086 | 0.5469 |
| 0.7394 | 2.0057 | 1400 | 0.8106 | 0.2012 | 0.5046 |
| 0.6703 | 2.2925 | 1600 | 0.8129 | 0.1991 | 0.4990 |
| 0.5731 | 2.5792 | 1800 | 0.7972 | 0.1989 | 0.5058 |
| 0.6649 | 2.8659 | 2000 | 0.7763 | 0.1940 | 0.4984 |
| 0.6588 | 3.1520 | 2200 | 0.8051 | 0.1953 | 0.4982 |
| 0.6042 | 3.4387 | 2400 | 0.7961 | 0.1877 | 0.4847 |
| 0.5617 | 3.7254 | 2600 | 0.7986 | 0.1898 | 0.4824 |
| 0.5678 | 4.0115 | 2800 | 0.7434 | 0.1900 | 0.4954 |
| 0.5980 | 4.2982 | 3000 | 0.7335 | 0.1911 | 0.4956 |
| 0.5983 | 4.5849 | 3200 | 0.7530 | 0.1876 | 0.4861 |
| 0.5272 | 4.8717 | 3400 | 0.7720 | 0.1878 | 0.4832 |
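The table above can be scanned programmatically to see which evaluation step was best on each metric. The sketch below copies the step, validation loss, Cer, and Wer columns from the table; the `EvalRow` type and `best_by` helper are hypothetical names introduced for illustration only.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    cer: float
    wer: float


# Values copied from the training results table.
EVAL_ROWS = [
    EvalRow(200, 0.9526, 0.2109, 0.5338),
    EvalRow(400, 0.9448, 0.2104, 0.5283),
    EvalRow(600, 0.8170, 0.1996, 0.5173),
    EvalRow(800, 0.8343, 0.2048, 0.5133),
    EvalRow(1000, 0.9082, 0.2060, 0.5184),
    EvalRow(1200, 0.7871, 0.2086, 0.5469),
    EvalRow(1400, 0.8106, 0.2012, 0.5046),
    EvalRow(1600, 0.8129, 0.1991, 0.4990),
    EvalRow(1800, 0.7972, 0.1989, 0.5058),
    EvalRow(2000, 0.7763, 0.1940, 0.4984),
    EvalRow(2200, 0.8051, 0.1953, 0.4982),
    EvalRow(2400, 0.7961, 0.1877, 0.4847),
    EvalRow(2600, 0.7986, 0.1898, 0.4824),
    EvalRow(2800, 0.7434, 0.1900, 0.4954),
    EvalRow(3000, 0.7335, 0.1911, 0.4956),
    EvalRow(3200, 0.7530, 0.1876, 0.4861),
    EvalRow(3400, 0.7720, 0.1878, 0.4832),
]


def best_by(metric):
    """Return the row with the lowest value for the given metric name."""
    return min(EVAL_ROWS, key=lambda row: getattr(row, metric))


for metric in ("val_loss", "cer", "wer"):
    row = best_by(metric)
    print(f"best {metric}: {getattr(row, metric):.4f} at step {row.step}")
```

The three metrics reach their minima at different steps (3000, 3200, and 2600 respectively), while the headline results at the top of this card match the final row at step 3400.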
Framework versions
- Transformers 4.52.1
- Pytorch 2.9.1+cu128
- Datasets 3.6.0
- Tokenizers 0.21.4